Dimension reduction for trees
Download
Report
Transcript Dimension reduction for trees
Dimension reduction for finite trees in L1
James R. Lee
Mohammad Moharrami
University of Washington
Arnaud De Mesmay
École Normale Supérieure
0
1
0
0
1
0
0
1
1
0
1
1
0
0
1
1
0
1
0
1
0
1
0
1
1
0
0
1
dimension reduction in Lp
Given an n-point subset X µ Rd, find a mapping
k
F :X ! R
such that for all x, y 2 X,
kx ¡ ykp · kF (x) ¡ F (y)kp · D ¢kx ¡ ykp
n = size of X
k = target dimension
D = distortion
Dimension reduction as “geometric information theory”
the case p=2
When p=2, the Johnson-Lindenstrauss transform gives, for every
n-point subset X µ Rd and " > 0,
³
k= O
log n
"2
´
D · 1+ "
Applications to…
-
Statistics over data streams
Nearest-neighbor search
Compressed sensing
Quantum information theory
Machine learning
dimension reduction in L1
Natural to consider is p=1.
n = size of X
k = target dimension
D = distortion
History:
¡ n¢
- Caratheodory’s theorem yields D=1 and k = 2
-
[Schechtman’87, Bourgain-Lindenstrauss-Milman’89, Talagrand’90]
Linear mappings (sampling + reweighting) yield
D · 1+" and k = O
-
³
n log n
"2
´
[Batson-Spielman-Srivastava’09, Newman-Rabinovich’10]
Sparsification techniques yield
D · 1+" and k = O
¡
n
"2
¢
the Brinkman-Charikar lower bound
[Brinkman-Charikar’03]:
There are n-point subsets such that distortion D requires
k¸ n
- (1=D 2 )
[Brinkman-Karagiozova-L 07]
Lower bound tight for these spaces
Very technical argument based on LP-duality.
[L-Naor’04]:
One-page argument based on uniform convexity.
more lower bounds
[Andoni-Charikar-Neiman-Nguyen’11]:
There are n-point subsets such that distortion 1+" requires
k ¸ n1¡
O(1= log(1=" ))
[Regev’11]:
Simple, elegant, information-theoretic proof of both the
Brinkman-Charikar and ACNN lower bounds.
Low-dimensional embedding ) encoding scheme
the simplest of L1 objects
A tree metric is a graph theoretic tree T=(V, E) together with
non-negative lengths on the edges len : E ! [0; 1 )
4
Easy to embed isometrically into RE equipped with the L1 norm.
dimension reduction for trees in L1
Charikar and Sahai (2002) showed that for trees one can achieve
³
k= O
3
log n
"2
´
A. Gupta improved this to k = O
D · 1+ "
³
2
log n
"2
´
In 2003 in Princeton with Gupta and Talwar, we asked:
Is
k = O(logn) possible?
D = O(1)
even for complete binary trees?
dimension reduction for trees in L1
Theorem: For every n-point tree metric, one can achieve
k= O
¡
1
"4
log
¡ 1 ¢¢
"
¢logn
and D · 1 + "
(Can get k = O
³
log n
"2
´
for “symmetric” trees.)
Complete binary tree using local lemma
Schulman’s tree codes
Complete binary tree using re-randomization
Extension to general trees
dimension reduction for the complete binary tree
0
1
0
1
0
1
0
1
0
1
001
1
1
0
1
110
Every edge gets B bits ) target dimension = B log2n
Choose edge labels uniformly at random.
Nodes at tree distance - (logn) have probability n¡
to get labels with hamming distance - (logn)
O( B )
dimension reduction for the complete binary tree
0
1
0
1
0
1
0
1
0
1
1
1
001
0
1
110
Every edge gets B bits ) target dimension = B log2n
Choose edge labels uniformly at random.
Siblings have probability 2-B to have the same label, yet
there are n/2 of them.
Lovász Local Lemma
Pairs at distance L have probability 1 ¡ 2¡ - (L ) to be “good”
Number of dependent “distance L” events is 2O( L )
LLL + sum over levels ) good embedding
Schulman’s tree codes
-
LLL argument difficult to extend to arbitrary trees.
Same as construction of Schulman’96:
Tree codes for interactive communication
0
1
0
1
0
1
0
1
0
1
001
1
1
0
110
1
re-randomization
0
0
1
0
0
0
1
0
1
1
1
0
1
1
0
0
1
1
0
1
0
1
0
1
0
1
1
0
0
1
Random isometry:
For every level on the right, exchange 0’s and 1’s with probability half
(independently for each level)
re-randomization
0
0
1
0
0
0
1
0
1
1
1
0
1
1
0
0
1
1
0
1
0
1
0
Pairs at distance L have probability 1 ¡ 2¡
Number of pairs at distance L is 2O( L )
1
0
1
- (L )
1
0
0
to be “good”
1
extension to general trees
Unfortunately, the general case is technical (paper is 50 pages)
Obstacles:
General trees do not have O(log n) depth
Use “topological depth” of Matousek.
How many coordinates to change per edge, and by
what magnitude?
Multi-scale entropy functional
open problems
Coding/dimension reduction:
Extend/make explicit the connection between L1 dimension
reduction and information theory.
Close the gap: For distortion
n
or
10, is the right target dimension
n
0:01
?
Other Lp norms:
Nothing non-trivial is known for p 2
= f 1; 2; 1 g