Single-copy Routing

Download Report

Transcript Single-copy Routing

ATCN: The Mathematics of Social
Networks (15/12/08)
Thrasyvoulos Spyropoulos (Akis)
[email protected]
Complex Networks
Internet
Paper X
Paper Y
Paper A
Paper B
Paper C
Paper D
Article Citations
2
Complex Networks (2)
 A complex network is a graph
 What are the statistical
properties of this graph?

Node degrees?

Path length?
 How can we model this graph?
 Why care about complex
networks?

Connectivity (epidemiology,
resilience)

Spread (information, disease)

Search (Web page, person)
3
Complex Network Measurements
 Measurements of Internet connectivity (traceroute maps), Citation
network, Protein network, IMDB actor collaborations.
Metrics:
 Path length: how many hops between two
nodes on average?

1
 dij
n(n - 1)/2 i j
 Degree distribution: probability p(k) of a node
having k neighbors
 Clustering coefficient (transitivity): If links A-B
and B-C exist: probability that A-C exists?
C
(1)
 triangles centeredat node i

 triples centeredat node i
i
i
4
Measurement Findings: Path Length
 Milgram’s experiment => Small World Phenomenon
 Short paths exist between most nodes: Path length l << total nodes N

Line network: path length l = O(N)
“Small world” = avg. path length l is at most logN
M. Newman
(2003)
5
Measurement Findings: Degree Distribution
 The degree distributions of most real-life networks follow a power law
p(k) = Ck-α
 Right-skewed/Heavy-tail distribution

there is a non-negligible fraction of nodes that has very high degree (hubs)

scale-free: f(ax) = bf(x), no characteristic scale, average is not informative
 Power-law distribution gives a line in the log-log plot
log p(k) = -α logk + logC
log frequency
frequency
degree
α
log degree
6
Measurement Findings: Degree Distribution
Taken from [Newman 2003]
7
Collective Statistics (M. Newman 2003)
α : power-law exponent (typically 2 ≤ α ≤ 3)
8
Measurement Findings: Transitivity/Clustering
 Is the friend of your friend, your friend also?
 Clustering coefficient: density of triangles (local clusters) in the graph
C
(1)
 triangles centeredat node i

 triples centeredat node i
i
i
 C = Prob (link A-C exists | A-B, B-C exist)
 If C > Prob(link A-C exists) => network is clustered
 In a random network, C → 0 as N → ∞
1
4
C
3
2
(1)
3
3


1 1  6 8
5
9
Communities
 Community: higher link density between nodes
 Between communities: lower link density
10
Measurement Findings: Transitivity/Clustering
11
Summary of Findings
Most real networks have…
1. Short paths between nodes (“small world”)
2. Transitivity/Clustering coefficient that is finite > 0
3. Degree distribution that follows a power law
Q1. Can we design graph models that exhibit similar
characteristics?
Q2. Can we explain how/why these phenomena
occur in the first place?
12
Random (or Poisson or Erdös-Renyi) Graphs
 The basic Gn,p (Erdös-Renyi) random graph model:

n : the number of vertices

0≤p≤1

for each pair (i,j), generate the edge (i,j) independently with probability p

p(degree = k): p(k)  B(n; k; p)   pk 1 pnk

Average node degree z = np
n
k 
 Simple: very well studied in the literature
 (average) path length l is logn/logz => Small World

Very short paths exist between any nodes. Why??

Number of nodes k hops far = zk => zl = n
13
Random Graphs (2)
 Is path length enough?
Degree Distribution:
zk  z
 As N → ∞ the degree distribution is Poisson p(k)  P(k; z)  e
k!

Highly concentrated around the mean

Probability of very high node degrees is exponentially small

Very different from power law!
Clustering Coefficient:
 The probability of two of your neighbors also being neighbors is p,
independent of local structure

clustering coefficient C = p

when z is fixed C = z/n =O(1/n)
No Navigability:
 Cannot find a node using local criteria (remember Milgram?)
14
Small World Graphs

Short paths must be combined with

High clustering coefficient

Ease of navigation: find a short path using only local criteria
Watts and Strogatz model [WS98]
 Start with a ring, where every node is connected to the next k nodes
 With probability p, rewire every edge (or, add a shortcut) to a random node
order
p=0
randomness
0<p<1
p=1
15
Small World Graphs (2)
Clustering Coefficient – Characteristic Path Length
log-scale in p
When p = 0, C = 3(k-2)/4(k-1) ~ ¾
L = n/k
For small p, C ~ ¾
L ~ logn
16
Scale-free Graphs: What About Power Laws?
 The configuration model

input: the degree sequence [d1,d2,…,dn]
4

1
3
2
process:
- Create di copies of node I; link them randomly
- Take a random matching (pairing) of the copies
• self-loops and multiple edges are allowed
But: Too artificial!
17
Preferential Attachment in Networks
“The rich get richer”
 First considered by [Price 65] as a model for citation networks

each new paper is generated with m citations (on average)

new papers cite previous papers with probability proportional to their
indegree (citations)

what about papers without any citations?
- each paper is considered to have a “default” citation
- probability of citing a paper with degree k, proportional to k+1
 Power law with exponent α = 2+1/m
18
Barabasi-Albert model
 The BA model (undirected graph)


input: some initial subgraph G0, and m the number of edges per new
node
the process:
- nodes arrive one at the time
- each node connects to m other nodes selecting them with
probability proportional to their degree
- if [d1,…,dt] is the degree sequence at time t, the node t+1 links to
node i with probability
d
d
i
d
i i

i
2mt
 Results in power-law with exponent α = 3
 Various Problems: cannot account for every power law observed
(Web), correlates age with degree, etc.
19
Processes on Social/Complex Networks
 Percolation

Connectivity of graph as nodes/links are added or removed

Network resilience to attacks, immunization, etc.
 Epidemiological Processes

Disease spread

Rumor spread
 Network Navigation

Search for a node/data using only local (“greedy”) algorithms

Design a network that facilitates fast navigation
20
Search in Scale-Free Networks
Locating a Piece of Data on a Network (e.g. Database, Web)
 Breadth-first Search

If data not available locally => ask all neighbors one-by-one

If a neighbors doesn’t have it => returns list of its own neighbors

Summary: Query 1-hop neighbors, then 2-hop neighbors, etc.
 Popularity-based Search (Adamic et al.)

If data not available locally => find neighbor with highest node degree

This neighbor takes over: looks for its own highest degree neighbor

Quickly find “hubs” => connected to large subsets of nodes
21
Milgram’s experiment revisited
 What did Milgram’s experiment show?

(a) There are short paths in large networks that connect individuals

(b) People are able to find these short paths using a simple, greedy,
decentralized algorithm
 Small world models take care of (a)
 Kleinberg: what about (b)?
22
Kleinberg’s model
 Consider a directed 2-dimensional lattice
 For each vertex u add q shortcuts

choose vertex v as the destination of the shortcut with probability
proportional to [d(u,v)]-r

when r = 0, we have uniform probabilities
23
Searching in a Small World: Algorithm

Given a source s and a destination t, the search algorithm
1. knows the positions of the nodes on the grid (geography
information)
2. knows the neighbors and shortcuts of the current node (local
information)
3. operates greedily, each time moving as close to t as possible
(greedy operation)
4. knows the neighbors and shortcuts of all nodes seen so far
(history information)
24
Small World (r < 2 or r > 2)
 “Long-range” contacts are chosen uniformly: r = 0 (Watts/Strogatz)
 Short paths (O(logn)) between any two nodes DO exist
 BUT: a local greedy algorithm (1-4) needs expected time (Kleinberg)
Ω(n2/3)
Generally:
- When r<2 a local greedy algorithm (1-4) needs expected time
Ω(n(2-r)/3).
- When r>2 a local greedy algorithm (1-4) needs expected time
Ω(n(r-2)/(r-1)).
25
Small World (r = 2)
 Short paths (O(logn)) between any two nodes exist
 (Kleinberg) When r=2, an algorithm that uses only local information
at each node (not 4) can reach the destination in expected time
O(log2n)
 Generalizes for a d-dimensional lattice, when r=d (query time is
independent of the lattice dimension)
- d = 1, the Watts-Strogatz model
 Why?

r > 2: “long-range contacts” are too “short” => too little progress

r < 2: “long-range contacts do not provide any “geographic cues”

r = 2: just the right navigational cues
26
Applications:
 Using Social Networks for Routing in Delay Tolerant Networks

Represent node contacts over time as a social graph

Navigate social graph
 Mobility Model using Social Networks

Use a social network as an overlay between “actors”/nodes

Determine mobility decisions based on social network
27
SimBet: DTN Routing Using Social Networks
 Mobility Model => (limited duration) contacts between nodes
 Convert contacts over time → Social Graph


1.
2.
3.
Contact between node A and B => social “link” between A and B
Algorithm:
Each node A keeps track of past contacts
For every node j that A has encountered: create a link A-j
Each node may exchange its connectivity table with other nodes it
encounters (or even other nodes’ connectivity tables)
28
SimBet: DTN Routing Using Social Networks (2)
 Exploit Social Network Analysis Techniques in order to:

Identify bridging ties
- Centrality

Identify clusters
- Similarity
Similarity: Number of common neighbors between two nodes
29
SimBet: DTN Routing Using Social Networks (3)
 Degree centrality

popular nodes in the network
N
C D ( pi )   a ( pi , p k )
k 1
 Closeness centrality

the distance of a given node to each node in the network
C C ( pi ) 
 Betweenness centrality

N 1
N
d( p , p
i
k 1
k
)
the extent to which a node can facilitate communication to other
nodes in the network
N
j 1
C B ( pi )  
j 1 k 1
g jk ( pi )
g jk
30
SimBet Utility Calculation
 Goal: to select node that represents the best trade off across both
attributes
Simn (d )
Sim Utiln (d ) 
Simn (d )  Simm (d )
Betn (d )
BetUtiln (d ) 
Betn (d )  Betm (d )
 Combined:
SimBetUtiln (d )  SimUtiln  BetUtiln
where
   1
31
Social Networks
 Graph model: Vertices = humans, Weighted Edges = strength of
interaction
32
Mapping Communities to Locations
 Assume a grid with different locations of interest

Geographic consideration might gives us the candidate locations
33
Mobility Between Communities
 pc(i) = attraction of node i to community/location c
p1(B)(t)
pC(i) 
w
jC
p2(B)(t)
ij
{j  C}
p3(B)(t)
34