Dimension reduction for trees


Dimension reduction for finite trees in L1
James R. Lee
Mohammad Moharrami
University of Washington
Arnaud De Mesmay
École Normale Supérieure
(Slide figure: a complete binary tree with 0/1 labels on its edges.)
dimension reduction in Lp
Given an n-point subset X ⊆ ℝ^d, find a mapping
F : X → ℝ^k
such that for all x, y ∈ X,
‖x − y‖_p ≤ ‖F(x) − F(y)‖_p ≤ D · ‖x − y‖_p
n = size of X
k = target dimension
D = distortion
Dimension reduction as “geometric information theory”
the case p=2
When p = 2, the Johnson-Lindenstrauss transform gives, for every
n-point subset X ⊆ ℝ^d and ε > 0,
k = O(log n / ε^2) and D ≤ 1 + ε
Applications to…
- Statistics over data streams
- Nearest-neighbor search
- Compressed sensing
- Quantum information theory
- Machine learning
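The p = 2 guarantee above can be sketched in a few lines: project through a scaled random Gaussian matrix. This is one standard JL construction; the leading constant 4 in the dimension formula and the sample data are illustrative choices, not from the talk.

```python
import numpy as np

def jl_transform(X, eps, rng=None):
    """Map n points in R^d to R^k with k = O(log n / eps^2) via a
    scaled random Gaussian matrix (one standard JL construction).
    The leading constant 4 is an illustrative choice."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    k = int(np.ceil(4 * np.log(n) / eps**2))
    G = rng.normal(size=(d, k)) / np.sqrt(k)  # scaling makes E||Gx||^2 = ||x||^2
    return X @ G

# 50 random points in R^1000; pairwise L2 distances are preserved
# up to a 1 +/- eps factor with high probability
X = np.random.default_rng(1).normal(size=(50, 1000))
Y = jl_transform(X, eps=0.5)
```

Note that the target dimension depends only on n and ε, not on the ambient dimension d; that independence from d is what makes the transform useful for streaming and nearest-neighbor applications.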
dimension reduction in L1
It is natural to also consider p = 1.
n = size of X
k = target dimension
D = distortion
History:
- Carathéodory's theorem yields D = 1 and k = (n choose 2).
- [Schechtman'87, Bourgain-Lindenstrauss-Milman'89, Talagrand'90]:
  Linear mappings (sampling + reweighting) yield
  D ≤ 1 + ε and k = O(n log n / ε^2).
- [Batson-Spielman-Srivastava'09, Newman-Rabinovich'10]:
  Sparsification techniques yield
  D ≤ 1 + ε and k = O(n / ε^2).
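The "sampling + reweighting" idea in the history above can be illustrated in its simplest, uniform form: keep k randomly chosen coordinates and rescale so that L1 distances are preserved in expectation. The cited results need careful non-uniform sampling to get concentration; this sketch only shows why the map is linear and unbiased.

```python
import numpy as np

def sample_and_reweight(X, k, rng):
    """Linear map R^d -> R^k: keep k uniformly sampled coordinates,
    rescaled by d/k so that E ||F(x) - F(y)||_1 = ||x - y||_1.
    (Uniform sampling is for illustration only; the cited results
    sample coordinates non-uniformly to control the variance.)"""
    d = X.shape[1]
    idx = rng.choice(d, size=k, replace=True)
    return X[:, idx] * (d / k)

x = np.array([0.0, 2.0, 0.0, 4.0])
y = np.array([1.0, 0.0, 3.0, 0.0])
# |x - y|_1 = 1 + 2 + 3 + 4 = 10; the reweighted L1 distance
# matches this in expectation over the sampled coordinates
```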
the Brinkman-Charikar lower bound
[Brinkman-Charikar’03]:
There are n-point subsets such that distortion D requires
k ≥ n^{Ω(1/D^2)}
[Brinkman-Karagiozova-Lee’07]:
The lower bound is tight for these spaces.
Very technical argument based on LP-duality.
[Lee-Naor’04]:
One-page argument based on uniform convexity.
more lower bounds
[Andoni-Charikar-Neiman-Nguyen’11]:
There are n-point subsets such that distortion 1 + ε requires
k ≥ n^{1 − O(1/log(1/ε))}
[Regev’11]:
Simple, elegant, information-theoretic proof of both the
Brinkman-Charikar and ACNN lower bounds.
Low-dimensional embedding ⇒ encoding scheme
the simplest of L1 objects
A tree metric is a graph-theoretic tree T = (V, E) together with
non-negative lengths on the edges, len : E → [0, ∞).
Easy to embed isometrically into ℝ^E equipped with the L1 norm.
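A minimal sketch of that isometric embedding (the vertex names and dict-based tree representation are illustrative): give each edge its own coordinate, and set coordinate e of f(v) to len(e) exactly when e lies on the root-v path. Then ‖f(u) − f(v)‖_1 sums the lengths over the symmetric difference of the two root paths, which is precisely the u-v path, i.e. the tree distance.

```python
def embed_tree(edges, root):
    """edges: dict {(parent, child): length}.  Returns {v: vector in R^E},
    where coordinate e of f(v) is len(e) if e is on the root-v path."""
    children = {}
    for (p, c) in edges:
        children.setdefault(p, []).append(c)
    order = sorted(edges)                      # fix a coordinate order on E
    coord = {e: i for i, e in enumerate(order)}
    emb = {root: [0.0] * len(order)}
    stack = [root]
    while stack:
        v = stack.pop()
        for c in children.get(v, []):
            vec = list(emb[v])
            vec[coord[(v, c)]] = edges[(v, c)]  # turn on edge (v, c)
            emb[c] = vec
            stack.append(c)
    return emb

edges = {("r", "a"): 1.0, ("r", "b"): 2.0, ("a", "c"): 3.0}
emb = embed_tree(edges, "r")
l1 = lambda u, v: sum(abs(x - y) for x, y in zip(emb[u], emb[v]))
# l1("c", "b") is 3 + 1 + 2 = 6, the c-b path length in the tree
```

The dimension here is |E| = n − 1, which is exactly why dimension reduction for trees in L1 is the interesting question.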
dimension reduction for trees in L1
Charikar and Sahai (2002) showed that for trees one can achieve
k = O(log^3 n / ε^2) and D ≤ 1 + ε
A. Gupta improved this to k = O(log^2 n / ε^2).
In 2003 in Princeton with Gupta and Talwar, we asked:
Is k = O(log n) with D = O(1) possible,
even for complete binary trees?
dimension reduction for trees in L1
Theorem: For every n-point tree metric, one can achieve
k = O((1/ε^4) log(1/ε) · log n) and D ≤ 1 + ε
(Can get k = O(log n / ε^2) for “symmetric” trees.)
- Complete binary tree using the local lemma
- Schulman’s tree codes
- Complete binary tree using re-randomization
- Extension to general trees
dimension reduction for the complete binary tree
(Slide figure: a complete binary tree whose edges carry random bits; a node’s label is the bit string along its root path, e.g. 001 and 110.)
Every edge gets B bits ⇒ target dimension = B · log_2 n
Choose edge labels uniformly at random.
Nodes at tree distance Ω(log n) get labels with Hamming distance Ω(log n), except with probability n^{-O(B)}.
dimension reduction for the complete binary tree
(Slide figure: the same labeled binary tree, highlighting a pair of siblings.)
Every edge gets B bits ⇒ target dimension = B · log_2 n
Choose edge labels uniformly at random.
Siblings have probability 2^{-B} to get the same label, yet
there are n/2 of them.
Lovász Local Lemma
Pairs at distance L have probability 1 − 2^{-Ω(L)} to be “good”.
The number of dependent “distance-L” events is 2^{O(L)}.
LLL + a sum over levels ⇒ good embedding
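The sibling obstruction is easy to see empirically (the parameters here are illustrative): two sibling edges get independent uniform B-bit labels, so their labels collide with probability exactly 2^{-B}, and a plain union bound over the n/2 sibling pairs would force B = Ω(log n) bits per edge.

```python
import random

def sibling_collision_rate(B, trials=20000, seed=0):
    """Empirical probability that two independent uniform B-bit edge
    labels agree -- the event that makes two sibling labels collide."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        e1 = [rng.randrange(2) for _ in range(B)]
        e2 = [rng.randrange(2) for _ in range(B)]
        hits += (e1 == e2)
    return hits / trials

# the rate is about 2^-3 = 0.125 for B = 3; the local lemma sidesteps
# the union bound, letting B stay constant while all bad events are avoided
rate = sibling_collision_rate(3)
```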
Schulman’s tree codes
- The LLL argument is difficult to extend to arbitrary trees.
- The construction is the same as Schulman’96:
  tree codes for interactive communication
(Slide figure: a binary tree with 0/1 edge labels, as in a tree code.)
re-randomization
(Slide figure: binary trees with 0/1 edge labels illustrating the re-randomization.)
Random isometry:
For every level on the right, exchange 0’s and 1’s with probability half
(independently for each level)
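Why this re-randomization is “free”: applying one shared flip mask per level to every label is an XOR by a fixed vector, which never changes Hamming distances. A sketch, where the depth·B-bit label layout is an assumption made for illustration:

```python
import random

def rerandomize(labels, depth, B, seed=0):
    """labels: {node: list of depth*B bits}.  For each level, flip that
    level's B-bit block in every label with probability 1/2 -- the same
    coin for all labels, so the map is an isometry of the Hamming cube."""
    rng = random.Random(seed)
    mask = []
    for _ in range(depth):
        flip = rng.randrange(2)               # one coin per level
        mask.extend([flip] * B)
    return {v: [b ^ m for b, m in zip(bits, mask)]
            for v, bits in labels.items()}

hamming = lambda a, b: sum(x != y for x, y in zip(a, b))
labels = {"u": [0, 1, 1, 0], "v": [1, 1, 0, 0]}
new = rerandomize(labels, depth=2, B=2)
# XOR by a fixed mask preserves every pairwise Hamming distance
```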
re-randomization
(Slide figure: the labeled binary trees after re-randomization.)
Pairs at distance L have probability 1 − 2^{-Ω(L)} to be “good”.
The number of pairs at distance L is 2^{O(L)}.
extension to general trees
Unfortunately, the general case is technical (the paper is 50 pages).
Obstacles:
- General trees do not have O(log n) depth.
  Use the “topological depth” of Matoušek.
- How many coordinates to change per edge, and by what magnitude?
  Multi-scale entropy functional.
open problems
Coding/dimension reduction:
- Extend/make explicit the connection between L1 dimension
  reduction and information theory.
- Close the gap: for distortion 10, is the right target dimension
  n or n^{0.01}?
Other Lp norms:
- Nothing non-trivial is known for p ∉ {1, 2, ∞}.