Transcript G n

Optimization in very large graphs
László Lovász
Eötvös Loránd University, Budapest
December 2012
1
How is the graph given?
- Graph is HUGE.
- Not known explicitly, not even the number of
nodes.
Idealize: define minimum amount of info.
December 2012
2
How is the graph given?
Dense case:
cn2 edges.
- We can sample a uniform random node a
bounded number of times, and see edges
between sampled nodes.
Bounded degree ( d)
- We can sample a uniform random node a
bounded number of times, and explore its
neighborhood to a bounded depth.
December 2012
3
Algorithms for very large graphs
Parameter estimation: edge density, triangle density,
maximum cut
Property testing: is the graph bipartite? triangle-free?
perfect?
Computing a structure: find a maximum cut,
regularity partition,...
December 2012
4
The maximum cut problem
NP-hard, even with 6% error
Hastad
Polynomial-time computable
with 13% error
Goemans-Williamson
maximize
Applications: optimization, statistical mechanics…
December 2012
5
Max cut in dense graphs
How to estimate the density?
cut with many edges
Sampling O( -4) nodes
the density of max cut in the induced subgraph
is closer than
to the density of the max cut in the whole graph
with high probability.
December 2012
Alon-Fernandez de la Vega
-Kannan-Karpinski
6
Estimable graph parameters
A graph parameter f can be estimated from samples iff
(a) f(G(k)) is convergent as k
blow up every node
(b) If V(Gn)=V(Hn)=[n] and
1
max S,T 2 eG (S,T ) - eH (S,T ) ® 0
n
n
n
cut distance
then f(Gn)-f(Hn)0
(c) If |V(Gn)|, vV(Gn), then f(Gn)-f(Gn\v)0
Borgs-Chayes-L-Sós-Vesztergombi
Density of maximum cut is estimable.
December 2012
7
Nondeterministic parameter estimation
Divine help: coloring nodes and edges, orienting edges
L: directed, (edge)-colored graph
L’: forget orientation, delete some colors,
forget coloring; “shadow of G”
f: parameter defined on colored
oriented graphs
f’(G) = max {f(L): L’ = G} ; “shadow of f”
If f is estimable, then f’ is estimable.
L.-Vesztergombi
December 2012
8
Testable graph properties
P: graph property
P testable: for every ε > 0 there is a k0 ≥ 1 and a
there is a „test property” P’, such that
(a) for every graph G ∈ P and every k ≥ k0,
G(k,G) ∈ P′ with probability at least 2/3, and
(b) for every graph G with d1(G,P) > ε
and every k ≥ k0 we have G(k,G) ∈ P′
with probability at most 1/3.
December 2012
9
Testable graph properties
Example: triangle-free
G’: sampled induced subgraph
G’ not triangle-free  G not triangle free
G’ triangle-free  with high probability,
G has few triangles
Removal Lemma:
 ’ if t(,G)<’, then we can delete n2 edges
to get a triangle-free graph.
Ruzsa - Szemerédi
December 2012
10
Testable graph properties
Every hereditary graph property is testable
Alon-Shapira
inherited by
induced subgraphs
December 2012
11
Nondeterministically testable graph properties
Divine help: coloring nodes and edges, orienting edges
L: directed, (edge)-colored graph
L’: forget orientation, delete some colors,
forget coloring; “shadow of L”
Q: property of directed, colored graphs
Q’={G’: GQ}; “shadow of Q”
P nondeterministically testable: P=Q’, where Q
is a testable property of colored directed graphs.
December 2012
12
Nondeterministically testable graph properties
Examples: maximum cut contains
at least n2/100 edges
contains a spanning subgraph
with a testable property P
we can delete n2/100 edges
to get a graph with a testable property P
December 2012
13
Nondeterministically testable graph properties
Every nondeterministically testable graph property
is testable.
L-Vesztergombi
N=NP for dense
property testing
Proof via graph limit theory:
pure existence proof
of an algorithm...
December 2012
14
Computing a structure: representative set
Representative set of nodes: bounded size,
(almost) every node is “similar” to
one of the nodes in the set
When are two nodes similar? Neighbors?
Same neighborhood?
December 2012
15
Similarity distance of nodes
d sim (s, t ) := Ev Eu (asu avu ) - Ew (atw awv )
s
w
u
t
v
This is a metric, computable in the sampling model
Representative set U:
for any two nodes in U, dsim(s,t) > 
for most nodes s, dsim(U,s)  
December 2012
16
Similarity distance of nodes
Every graph contains a representative set
O(1/ e2 )
with at most 2
nodes.
Strong representative set U:
for any two nodes in U, dsim(s,t) > 
for every node, dsim(U,v)  
Every graph contains a strong representative set
O(log(1/ e ) / e2 )
with at most 2
December 2012
nodes.
Alon
17
Representative set and regularity partition
Voronoi diagram
= weak regularity partition
December 2012
18
Max cut in dense graphs
What answer to expect?
- Cannot list all nodes on one side
For any given node, we want to tell
on which side of the cut it lies
(after some preprocessing)
December 2012
19
How to compute a (weak) regularity partition?
Construct representative set U
Each node is in same class as closest representative.
December 2012
20
How to compute the maximum cut?
- Construct representative set U
- Compute pij = density between Voronoi cells Vi and Vj
(use sampling)
- Compute max cut (U1,U2) in complete graph on U
with edge-weights pij
Each node is on same side as closest representative.
(Different algorithm implicit by Frieze-Kannan.)
December 2012
21
Algorithms for bounded degree graphs
Bounded degree ( d)
- We can sample a uniform random node a
bounded number of times, and explore its
neighborhood to a bounded depth.
December 2012
22
Bad examples
Gn: random 3-regular graph
Fn: random 3-regular bipartite graph
Hn: GnGn
Large girth
graphs
(except for a
small part)
December 2012
Expander
graphs
(except for a
small part)
23
Algorithms for bounded degree graphs
Maximum cut cannot be estimated in this model
(random d-regular graph vs.
random bipartite d-regular graph)
P NP in this model
(random d-regular graph vs.
union of two random d-regular graphs)
December 2012
24
Algorithms for bounded degree graphs
Cellular automata (1970’s): network of finite automata
Distributed computing (1980’s): agent at every node,
bounded time
Constant time algorithms (2000’s): bounded number of
nodes sampled, explored at bounded depth
December 2012
25
Distributed computing model
December 2012
26
Algorithms for bounded degree graphs
(Almost) maximum matching can be computed
in bounded time.
Nguyen-Onak
(Almost) maximum flow and (almost) minimum cut
can be computed in bounded time.
Csóka
December 2012
27
Distributed computing model
December 2012
28
Algorithms for bounded degree graphs
(Almost) maximum matching can be computed
in bounded time.
Nguyen-Onak
need local random bits
(Almost) maximum flow and (almost) minimum cut
can be computed in bounded time.
Csóka
no random bits
need global random bits
December 2012
29
Algorithms for bounded degree graphs
Suppose you tell the agents any information about
the graph.
In the presence of global random bits,
this does not help!
Csóka
December 2012
30
Open problems
● Can an (almost) maximum independent node set be
computed in a random d-regular graph in this model?
● Can an (almost) maximum cut be
computed in a random d-regular graph in this model?
(Gn)/n tends to a limit with probability 1
MaxCut(Gn)/n tends to a limit with probability 1
Bayati-Gamarnik-Tetali 2010
December 2012
31
December 2012
32