free sex gallery

Download Report

Transcript free sex gallery

Scale-Free Networks:
Models, Structure, and Dynamic Processes
Doochul Kim
School of Physics
Seoul National University
(NCTS, NTU, Taipei, Sept. 13, 2004)
Collaborators: Byungnam Kahng, Kwang-Il Goh,
DS Lee, H Jeong, E Oh
Outline
I. Why scale-free network (SFN)?
– Introduction
II. How do we describe them?
– Characterization of structures
III.How do such forms appear?
– Models of SFN
IV.What can happen on them?
– Dynamic processes
V. Summay
1. Why scale-free network?
– Introduction
Complex systems
- We want to better understand interacting many-body systems.
- The interactions between the constituents are often too
complicated and specific to be modeled precisely.
- The patterns of interconnectedness are highly nontrivial.
Network
representation
Focuses on the topology of
the interconnections and
its effect on dynamics
Why graph and network?
Lattice
nodes  spins
links  interactions
Disordered systems
inhomogeneity
due to disorder
But how about social (small world) networks?
Collaboration Network
(after Newman and Park 2003)
Simplest model:
Random Graphs of Erdős-Rényi (ER) (1960)
- There are N vertices (nodes)
- Links are connected with equal probability
p/N
- Equivalently, links are randomly attached one by one until
there are L links.
- Degree distribution (Probability that a node has
Poissonian
e p p k
PD (k ) 
k!
k links) =
- Diameter (mean distance between two nodes) =
D ln N
(small world behavior)
- Percolation transition when p=1 (L=N/2) (appearance of
the giant component ).
JavaApplet
Small-World Networks (Watts & Strogatz, 1998)
Regular lattice + Small fraction of random rewiring.
Small diameter (~ logN)
+
High clustering coefficient
But are they good representations?
WWW
Protein interaction network
http://www.nd.edu/~networks/gallery.htm
Social network
Internet
Nodes: computers, routers
Links: physical lines
(Faloutsos, Faloutsos and Faloutsos, 1999) – Router degree distribution
Faloutsos, Faloutsos, Faloutsos (1999) – Internet AS degree distribution
World Wide Web
Albert, Jeong and Barabasi (1999) – nd.edu WWW
Scale-free networks (SFN)
For many real-world networks, the distribution of the number
of links a node has follows a power law,
pd(k) ~ k-g.
They are called the scale-free networks [Barabasi & Albert 1999].
More examples of scale-free networks:
- Protein-protein interaction network
- Metabolic networks
- Citation networks
- Collaboration networks
- Actor networks
- Sex partnership networks
Reported values of g
vs. system size
[From Dorogovtsev & Mendes, 2002]
The second moment of
degree diverges when
2g 3
Prototype model of SFN - BA model:
Growing SFN model by Barabási & Albert (1999)
A. Growth
Network is not static but evolves and grows with time.
N(t) = N(0) + t .
B. Preferential attachment:
Links are not made randomly but preferentially.
k
t 1
)
(
 (
t)
t 1
)
 k(


pd(k) ~ k-3
Error and attack tolerance of SFN:
Random failure vs. intentional attack
After Albert, Jeong, and Barabasi (2000)
Diameter of the SFN:
SFN is ultrasmall!
ìï log N
ïï
D ~ í log N / log log N
ïï
ïïî log log N
(g > 3),
(g = 3),
(2 < g < 3).
[Chung et al., Cohen et al., Dorogovtsev et al. (2002)]
Major questions
1. How do we characterize the structure of networks?
2. How does the specific form of the network emerge?
3. How does the structure of network affect the
collective behavior of physical processes occurring
on top of it?
II. How do we describe them? Characterization of structures
Graph theoretical terms (for sparse, undirected, nondegenerate graphs)
G
a = adjacency matrix element of a graph G :
(1 if () connected, 0 otherwise)
- degree
ki =
å
j
ai j (i = 1, L ,N )
- degree distribution:
N  Pd (k ) = # of vertices with degree k
- joint degree distribution:
 ai , j  2 L
i j
2 L  P2 (k , k ')= # of links connecting vertices
with degree
- cluster/component size distribution, giant component
- module/community
k, k’.
Network correlations
2-node correlations: Average neighbor degree function
knn(k)=
 k ' P2 (k , k ') / kPd (k ) / k


k'
3-node correlations; clustering:
Local clustering function C(k)
Local clustering coefficient C
C=
# of edges between neighbors
k(k-1)/2
no correlation; neutral
negative correlation;
disassortative
k
C(k)
knn(
k)
positive correlation;
assortative
modular clustering
hierarchical clustering
no clustering
k
Walks on SFN
Walk: A sequence of vertices, all the successive elements are
connected by an edge.
v1 = principal eignenvector of A={aij} (associate with
the largest eigenvalue).
n>>1,
y(n) 
l1nv,12
# of returning walks is
determined by the largest
eigenvalue and its eigenvector
Spectral property  Distribution of returning walks
Spectra of SFN
exponential decay
near the center
[Goh et al. 2001, Dorogovtsev et al. 2003]
power-law decay
near the edge
spectrum does not follow the semicircle law of the ER random graph
Fat tail:
 ( )  (2g 1)
l1 ~ N1/4 ~
kh1/2
Principal eigenvector
ER graph
BA model
eigenfunction is highly localized
at the hub
 1/2
~
k
Centralities : Which nodes are more important?
• Degree centrality: CD() =
Hubs are nodes with largest k
k.
’s.
• Closeness centrality: CC() = 1/S
d,.
d = distance = hops along the shortest path
• Betweenness centrality: CB() = S,k
g,k()/g,k.
gk = number of shortest paths between  and k .
(Wasserman
& Faust, 1994)
of the largest
• Eigenvector centrality: eigenvector
eigenvalue of the adjacency matrix A .
• Random walk centrality: Noh and Rieger (2003), Newman
Betweenness Centrality (BC) [Freeman, 1977]
g(k)  (fraction in the number of the shortest
paths between i and j that pass through k.)
“How much k-th node is influential to the communication
between i and j”
2
3
 Example:
the BC at k contributed by
the communication from i to j
is g(k) = 2/3
i
1
 Accumulate over all ordered pairs:
1
3
1
3
k
2
3
g(k) = S,
(k)
g
1
j
Load
How much burden would be given to the vertex when every pair
sends and receives data continually?: The Load [Goh et al. 2001]
lij(k)  (fraction of a unit packet sent from node i to node j
along the shortest paths, that pass through k, assuming
even division at branching points and accumulation at
merging points.)
Example:
load at k due to a packet
from i to j is
(

)
k
l 3 
1
2
i
1
 Accumulate over all ordered pairs:
1
4
1
2
k
3
4
lk   lk( i  j )
i, j
- Numerically similar to the Betweenness Centrality
1
j
Load distribution
[Goh et al. 2001]
The load distribution follows a power law for scale-free networks
g=2.2 ()
g=2.5 ()
g=3.0
()
g=4.0 ()
g=  ()
static model
ER graph
The load exponent seems to be robust as d  2.2(1), for 2
< g < 3.
cf. For scale-free trees, the load exponent is d=2.
[Szabo et al. 2002; Bollobas et al. 2004; Ghim et al. 2004]
Classification of SFN
[Goh et al. 2002]
SFN fall into two types according to the load distribution
d = 2.2(1)
Collaboration network, protein
interaction network, metabolic
network of eukaryotes, etc.
d = 2.0(1)
Internet, metabolic network of
archaea, WWW, etc.
Catalogue of real-world and model networks
Metabolic network of E. nidulans
Internet autonomous systems
WWW within www.nd.edu
Mass-distance relation
Mass density function
Diameter change (1)
How much would the diameter change
by the removal of a single vertex?
[J.H.Kim et al. 2003]
Diameter: Average separation between
vertex pair within largest component.
 A measure of efficiency
Before deletion
After deletion
Cascade in shortest pathway
configuration change
1
D =  dk
 ,k
0
D -D

D0
What is the typical value of the
diameter change?
Can we predict it?
Diameter change (2)
Diameter change distribution follows a power law for
scale-free networks
Fat tail for large D
static model
z = 2.2(1)
ER graph
Diameter change is mostly small;
96% are within 2x10-4
Diameter change (3)
Protein interaction network
of the yeast. z  2.3
Internet AS level.
z  1.7
Diameter change (4)
In scale-free networks, the “average” value of the
diameter change becomes meaningless, since its
variance diverges.
Can we predict the diameter change by the single vertex
failure?
Yes, for ER networks but
No, for scale-free networks.
 Scale-free network is a good example of complex
systems showing highly diverse responses to minute
perturbations.
III. How such forms appear? Models of SFN
Generalized BA model
 Growth : At each time step we add a new node with m edges
(connected to the nodes already present in the system).
 Preferential Attachment : The probability  that a new node
will be connected to node i depends on the connectivity ki of that
node
ki (t )  am
 i (t ) 
 j  k j (t )  am 
Pd(k) ~ kg
a)
(g
=
3 +
Hierarchical SFN model
[Ravasz & Barabasi 2003]
A growing network model that is both scale-free and hierarchical.
P(k) ~ k-g
C(k) ~ k-b
C(N) ~ const
Huberman-Adamic model
[Huberman & Adamic 1999]
Network growth as a multiplicative stochastic process
1. Exponential growth in AS numbers: N(t) = N(0)exp(at).
2. Geometric growth in AS degree: k(t) = k(t1)[1+g+x(t)].
2 with 2
3. Assume g = g0 and x(t) to beg
Gaussiang
r.v.
+2as



2
variance
s
.
+
0
2
2

pd(k)g=1s 
s 
~ k-g,
We measure them to be a = 0.029, geff = 0.016, seff = 0.14
 g  2.09
The Internet is
i) actively fluctuating in its connectivity;
ii) highly correlated.
Map of Internet IP addresses as of Nov. 23rd, 2003
http://www.opte.org
Asia Pacific - Red
Europe/Middle East/Central Asia/Africa - Green
North America - Blue
Latin American and Caribbean - Yellow
RFC1918 IP Addresses - Cyan
Unknown - White
The adaptation model (1)
An improved model for evolving Internet [Goh et al. 2002]
Adaptive rewiring
disconnected
One would not choose a worse partner than it abandoned.
The adaptation model (2)
Model vs data for
dead ends and the
hub
Internet as of January 2000
Huberman-Adamic model
Adaptation model
Protein interaction network (PIN)
- Physical interactions
between proteins occur in a
key-and-lock way.
- Protein structure is well
conserved during evolution,
based on which the
proteins can be classified
into families.
How the high clustering and the strong modular
organization appear?
 Family compatibility constraint
Hybrid model of PIN (1)
A model for evolution of the PIN: The compatibility
constraint
Protein family network
Protein interaction network
Duplication with
rate a
Mutation with
rate 1
Divergence with
probability d
Proteins can interact only with those in the
compatible families.
Hybrid model of PIN (2)
Snapshot
N = 600
NF = 62
Hybrid model of PIN (3)
Yeast protein interaction network
Hybrid network model
Without family constraint
Equilibrium ensembles of SFN
• Ensemble of graphs:
O =
å
P (G)O(G)
G
• Various equilibrium ensembles for SF networks
• Microcanonical ensemble with given degree sequence {k}.
Molloy and Reed construction (1995,1998)
• Canonical ensemble with fixed N & L.
• Grandcanonical ensemble with fixed N but with fluctuating L.
Energy models:
P(G )  exp( H (G ))
Park & Newman cond-mat/0405566
Farkas et al. cond-mat/0401640
Fitness (Hidden variable) models: Goh et al. (2001), Caldarelli et al.(2002)
Static model (1)
A generic model of SFN
1. To understand the generic properties of the SFN.
2. Useful substrate on which the various dynamic processes are
simulated.

1.
1.
2.
2.
3.
4.
Static model
[Goh et al. 2001; Lee et al. 2004]
There are N vertices labeled by the integer  (1    N) and assigned the
There are
labeled
by the
integer
 (1in the
 range
 N) (0,
and1).
assigned the
weight
wN= vertices
-m, with
m being
a real
constant
-m, with m being a real constant in the range (0, 1).
weight
At
eachwunit
time
duration, pick two vertices  and , with probabilities
=
At each
time
anP
edge
between
vertices  and  is realized with
P
=w/S
kwstep,
k and
=w
/Skwtwo
k, respectively.
probability
= pp, where
w/Skwotherwise,
= w/Sk
wk.
If
there is anPedge
and ,pdo
them.
 between
=nothing;
k and pconnect
Repeat 2-3 for KN=<L> times or until there are L links.

pd(k) ~ k-g, g =
JavaApplet
1+1/m.
Static model (2)
g=2
g=2
.5
g=3
g=5
g=
g=3
Static model (4)
• Prob( ai j =0)= (1- 2Pi Pj )NK = e
- 2 K NPi Pj
= 1- f i j
• Prob( ai j =1)= f i j = 1- e
Comments
- 2 K NPi Pj
– When m=0  ER case.
– Walker algorithm (+Robin Hood method) constructs networks in
time ON.
 N107 network in 1 min in a PC.
– Monte Carlo simulation with edge addition (deletion) prob. 
(1)  equivalent but inefficient.
 Such algorithm realizes a “grandcanonical ensemble” of
graphs G={a} with weights
P (G) =
Õ f i j Õ (1-f i j ) =
bÎ G
bÏ G
Õ f i j i j (1- f i j )
a
i> j
1- ai j
Static model (5)
k i = 2K NPi : i
- m
L =
,
å
f ij = K N ,
i> j
- 2 K NPi Pj
– Recall f i j = 1- e
Comments
– 2K NPi Pj : N 2m-1 /(i j )m
– When >3 (0m1/2),
2KNPP.
1
l = 1+
m
ln j
ln N
1
3
– When 23 (1/2m1)


0
2KNP
P

1
3
1
– Bosonic model (allow multiple links)
Prob(a=n) is Poissonian with a=2NKPP.
ln i
ln N
k i = 2K NPi : i - m
Static
L = åmodel
f = K N (6) Finite-size effect
ij
i> j
ìï K 2 N
l >3
ï
(mean # of triples ) : å f i j f i k : í 2 2/ (l -1)
ïïî K N
2< l < 3
i , j ,k
3
ìï
K
l >3
ïï
(mean # of triangles ) : å f i j f j k f ki : í 3(l - 1) 3(3-l )
ïï 2
i , j ,k
N2
2< l < 3
ïî K
Clustering coefficient
0
6(# of V)
10 (# of Ù)
100
10-2
10-2
10-4
10-4
10-6
10-6
10
10-8-8

2
3
10
5
3 
10
5 6
78
107
91
0
N
Ci
10-8

2
3
10
5
3 
10
5 6
78
107
91
0
N
Static model (7) Potts model representation
• Potts model  Bond percolation through Kasteleyn construction
Potts spin:

, q.
Hamiltonian:
s  1, 2,
H  2K N  Pi P j [ (s i , s j )  1]
i >j
Boltzmann weight:
e- H  [e
2 K NPi Pj
2 K NPi Pj
(1 e
) (s i , s j )]
i >j
  [(1 f i j ) f i j  (s i , s j )]
i >j
Partition function:
Z  q# of clust ers
Potts model order parameter:
 q (s i ,1)  1
i  q1 
  (1 f i j ) [f i j  (s i , s j )]
G bG
bG
  P (G)  (s i , s j )
G
bG

 giant cluster size (S= mN )
q1
H
Potts model susceptibility


q1
mean cluster size s  ’ s2 n (s) / N
s

Z
q
 L N = mean number of independent loops ( N )
q= 1
Static model (8) Potts model representation
Add h 0  [q (s i ,1)  1] to H :
i
Comments
1
N
 q (s i ,1)  1 
i  q1 


q1
H
Generating function of
the cluster size distribution P (s)=sn (s)/ N
1) H is NOT the Potts model on a scale-free network, but on the
complete graph as a tool to generate SF networks.
2) It is NOT the Hamiltonian defining the equilibrium ensemble
H    lnP (G)
In our case, H    a ln(e2K NPi Pj  1)  constant
ij
i >j
Static model (9) Thermodynamic limit
Exact analytic evaluation of the Potts free energy:
ur ö2
æ
H ~ ççå Pi si ÷
÷
÷
çè i
ø
1. Vector spin representation 
2. Integral representation of the partition function
3. Saddle-point analysis
å
•
Percolation transition at K c = (2N
•
Explicit evaluations of thermodynamic quantities.
Pi2 )- 1 « ák 2 ñ ákñ= 2
i
>
b=1
m
g=1
s
n= 3
l

.8
3 m
b=
1
l - 3
g=1
s
l
n=
l - 1
l - 3
3.
6
23
b=
m
1
3- l
s
n=
l
l - 1
3- l
2.

Static model (10) Percolation
Cluster size distributions
Branching process approach [cf. Newman et al. 2001] becomes
exact (almost no loops in finite clusters).
1/ s
1
s / sc
P (s)=sn (s)/ N ~s e
l > 4
3< l < 4
2< l < 3
t
5
2
2l - 3
l - 2
l
s
1
2
l - 3
l - 2
3- l
l - 2
• Giant cluster size at Kc(N)
 K

sc ~ 
 1
K c 
cf. Cohen et al. (2002)
for site percolation
S ~ N 1/( t - 1)
IV. What can happen on them? –
Dynamic processes
Dynamic processes on SFN
• Percolation: Cohen et al. (2000, 2001).
• Epidemic spreading: Pastor-Satorras and Vespignani (2001),
Newman
(2000), etc.
• Ising and Potts models: Dorogovtsev et al. (2002, 2003), etc.
• Random walks: Noh & Rieger (2004), etc.
• Synchronization: Motter et al. (2004), Moreno et al. (2004), Oh et al.
(2004), etc.
• Opinion dynamics: Stauffer et al. (2003), Fortunato (2004).
• Cascading failure: Watts (2002).
Sandpile model: Goh et al. (2003); OFC model: Zhou et al. (2004).
• Reaction-diffusion processes: Gallos et al. (2004).
Sandpile (1)
Sandpile on scale-free networks
[Goh et al. 2003]
1. Pile a grain at a random site v.
2. If the height of that site reaches
or exceeds the threshold zv, it
topples.
v
We set zv=kv in the following to
reflect the heterogeneity of SF
networks
Avalanche size s : # of sites that have toppled during the avalanche.
pa(s) ~ s-t,
tMF = 3/2.
Q: Will the critical behavior be affected?
Sandpile (2)
Sandpile avalanche vs. Branching tree
assumes that the avalanche does not form loops.
Avalanche size distribution  Branching Tree size distribution
with the appropriate branching probability qk
Sandpile (3)
Mapping to branching process
Branching probability qk :
The probability that a site will produce k progenies.
i) The avalanche reaches a vertex with degree k : qk’
ii) The vertex actually topples : qk’’
assuming the absence of correlation in degrees, and that
each height is equally probable [confirmed numerically].

Sandpile (4)
Avalanche size distribution in SF networks
Sandpile (5)
Power-law degree distribution can change the
critical behavior of the sandpile
t = 1.52(1)
t = 2.09(8)
g = 2.0
g = 2.2
g = 3.0
ER
Sandpile (6)
Critical behavior reduces to the mean-field type
when the threshold is uniform : zv = const = 2
t  1.5
ER
g = 3.0
g = 2.2
V. Summary
1. Scale-free networks are everywhere in Nature and it is a
fun to study it.
2. As a way to characterize the structure of SFN, spectra of
the adjacency matrix are discussed. Also, the load is
introduced as a measure of the centrality and is shown to
have robust power law behaviors.
3. The adaptation model for Internet and the hybrid model
for PIN are discussed. Also the static model is presented
and solved as a natural generalization of the ER random
graph.
4. The cascade dynamics on SFN are studied via the
sandpile model which show anomalous critical exponents
for 2<g<3.
5. Response of the complex system to a small perturbation
can only be described in probabilistic ways as exemplified
in the diameter change distribution of SFN.