Geometry of Online Packing Linear Programs


Geometry of Online Packing Linear Programs
Marco Molinaro and R. Ravi
Carnegie Mellon University
Packing Integer Programs (PIPs)
• Non-negative c, A, b
• A is an m × n matrix with entries in [0,1]
• Maximize cx subject to Ax ≤ b, x ∈ {0,1}^n (written out below)
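For reference, the packing program from this slide written out in one place (nothing new, just the objective, the constraints, and the stated ranges of A, c, b):

```latex
\max_x \; c^{\top} x
\quad \text{s.t.} \quad Ax \le b, \quad x \in \{0,1\}^{n},
\qquad A \in [0,1]^{m \times n}, \;\; c \ge 0, \;\; b \ge 0.
```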
Online Packing Integer Programs
• Adversary chooses the values of c, A, b
• …but the columns are presented in random order
• …when a column arrives, its variable must be set to 0/1 irrevocably
• b and n are known upfront
(a small sketch of this arrival protocol appears below)
[Figure: the columns of A, with values c, revealed one at a time under the constraint Ax ≤ b]
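A minimal sketch of this arrival model, with illustrative names (`run_random_order`, `decide`) that are not from the talk: the adversary fixes (c, A, b), the columns arrive in uniformly random order, and each variable is fixed to 0 or 1 on arrival.

```python
import numpy as np

def run_random_order(c, A, b, decide, rng=None):
    """Simulate the random-order online PIP protocol.

    c: (n,) non-negative values, A: (m, n) with entries in [0, 1],
    b: (m,) non-negative budgets.  decide(t, c_t, column, used) must
    return 0 or 1, and the decision is irrevocable.
    """
    rng = rng or np.random.default_rng()
    m, n = A.shape
    x = np.zeros(n, dtype=int)
    used = np.zeros(m)                       # budget already occupied
    for t in rng.permutation(n):             # columns in random order
        x[t] = decide(t, c[t], A[:, t], used)
        used += x[t] * A[:, t]
    assert np.all(used <= b + 1e-9), "decisions must keep Ax <= b"
    return x, c @ x
```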
Online Packing Integer Programs
• Goal: find a feasible solution that maximizes the expected value
• (1 − ε)-competitive: E[ALG] ≥ (1 − ε)·OfflineIP
Previous Results
• First online problem: the secretary problem [Dynkin 63]
• B-secretary problem (m = 1, b = B, A is all 1's) [Kleinberg 05]: (1 − ε)-competitive for B = Ω(1/ε²); the bound does not depend on n
• PIPs (B = min_i b_i) [FHKMS 10, AWY]: (1 − ε)-competitive, but need B = Ω((m/ε²) log(n/m)); the bound depends on n
Main Question and Result
• Q: Do general PIPs become more difficult for larger n?
• A: No!

Main result
Algorithm is (1 − ε)-competitive when B = Ω((m²/ε²) log m)
High-level Idea
1. Online PIP as learning
2. Improving learning error using tailored covering bounds
3. Geometry of PIPs that allow good covering bounds
4. Reduce general PIP to above
For this talk:
• c = 1
• Every right-hand side b_i = B
• Show the weaker bound B = Ω((m²/ε³) log m)
Online PIP as Learning
1) Reduction to learning a classifier [DH 09]
Linear classifier: given a (dual) vector p ∈ R^m, set x(p)_t = 1 iff pA^t − 1 < 0
[Figure: columns A^t on either side of the hyperplane pA = 1; columns with pA^t < 1 get x(p)_t = 1, the rest get 0. A one-line sketch of the rule follows.]
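A one-line version of this rule (the function name is illustrative): p prices the m resources, and a column is taken exactly when its total price is below its unit reward.

```python
import numpy as np

def classify(p, A):
    """x(p)_t = 1 iff p . A^t - 1 < 0, for every column t of A (shape m x n)."""
    return (p @ A < 1).astype(int)
```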
Online PIP as Learning
1) Reduction to learning a classifier [DH 09]
Linear classifier: given a (dual) vector p ∈ R^m, set x(p)_t = 1 iff pA^t − 1 < 0

Claim: If the classification x(p) given by p satisfies
1) [Feasible] Σ_t A^i_t x(p)_t ≤ B for every constraint i
2) [Packs tightly] If p_i > 0, then Σ_t A^i_t x(p)_t ≥ (1 − ε)B
then x(p) is (1 − ε)-optimal.
Moreover, such a classification always exists.
(a short duality argument for the claim is sketched below)
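The claim is stated without proof on the slide; here is the standard weak-duality calculation that gives it in the talk's setting (c = 1, b_i = B), included as a reminder rather than as the speakers' argument:

```latex
% Weak duality: for any p >= 0,  OPT <= B \sum_i p_i + \sum_t (1 - pA^t)_+ .
% Since x(p)_t = 1 exactly when 1 - pA^t > 0:
\begin{align*}
\sum_t x(p)_t
 &= \sum_t x(p)_t\,\bigl(1 - pA^t\bigr) + \sum_i p_i \sum_t A^i_t\, x(p)_t\\
 &\ge \sum_t \bigl(1 - pA^t\bigr)_+ \;+\; (1-\epsilon)\,B \sum_i p_i
   && \text{[packs tightly]}\\
 &\ge (1-\epsilon)\Bigl(\sum_t \bigl(1 - pA^t\bigr)_+ + B\sum_i p_i\Bigr)
  \;\ge\; (1-\epsilon)\,\mathrm{OPT}.
\end{align*}
% Condition 1) [Feasible] is what makes x(p) a feasible solution.
```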
Online PIP as Learning
2) Solving PIP via learning
a) See the first ε fraction of the columns
b) Compute an appropriate p for the sampled IP:
[Feasible] Σ_{t ≤ εn} A^i_t x(p)_t ≤ εB − ε²B
[Packs tightly] If p_i > 0, then Σ_{t ≤ εn} A^i_t x(p)_t ≥ (1 − ε)εB + ε²B
c) Use p to classify the remaining columns
(a sketch of steps a)-c) follows)
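A minimal end-to-end sketch of steps a)-c) under the talk's assumptions (c = 1, every budget equal to B). Here the dual vector p is obtained by solving the sampled LP relaxation with scaled-down budgets and reading off its duals, which is one concrete way to get a p with the properties above (an implementation choice of this sketch, not something prescribed by the slide):

```python
import numpy as np
from scipy.optimize import linprog

def one_time_learning(A, B, eps, rng=None):
    """Learn-then-classify sketch for the online PIP (c = 1, b_i = B)."""
    rng = rng or np.random.default_rng()
    m, n = A.shape
    order = rng.permutation(n)
    s = max(1, int(eps * n))
    sample, rest = order[:s], order[s:]

    # b) LP relaxation on the sampled columns with budgets scaled to (1 - eps)*eps*B;
    #    the optimal duals of the budget constraints serve as the prices p.
    res = linprog(c=-np.ones(s),
                  A_ub=A[:, sample], b_ub=np.full(m, (1 - eps) * eps * B),
                  bounds=[(0, 1)] * s, method="highs")
    p = -res.ineqlin.marginals            # nonnegative dual prices, one per constraint

    # a) the sampled columns themselves are rejected (they carry only ~eps of the value)
    x = np.zeros(n, dtype=int)
    used = np.zeros(m)
    # c) classify every remaining column with the learned prices,
    #    with a safety check so the budgets are never exceeded
    for t in rest:
        if p @ A[:, t] < 1 and np.all(used + A[:, t] <= B):
            x[t] = 1
            used += A[:, t]
    return x
```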
Online PIP as Learning
2) Solving PIP via learning
Probability of learning a good classifier:
• Consider a classifier p that overfills some budget: Σ_t A^i_t x(p)_t > B
• We can only learn such a p if the sample is skewed, which happens with probability at most exp(−ε³B)
• There are at most ~n^m distinct bad classifiers
• Union bounding over all bad classifiers, we learn a bad classifier with probability at most n^m · exp(−ε³B)
• So B = Ω((m/ε³) log n) suffices to get a good classifier with high probability (calculation below)
Online PIP as Learning
2) Solving PIP via learning
Improve this: the ~n^m term in the union bound is what forces the requirement on B to depend on n.
Improved Learning Error
• Idea 1: Covering bounds via witnesses (handling multiple bad classifiers at a time)

+-witness: p is a +-witness of p′ for constraint i if
1) the columns picked by p ⊆ the columns picked by p′
2) the total occupation of constraint i by the columns picked by p is > B
−-witness: similar…

Lemma: Suppose there is a witness set of size M. Then the probability of learning a bad classifier is ~ M · exp(−ε³B).
(see the resulting requirement on B below)
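With the lemma, the union-bound arithmetic from two slides back goes through with the ~n^m bad classifiers replaced by a witness set of size M (δ again just denotes a target failure probability used for illustration):

```latex
\Pr[\text{learn a bad classifier}] \;\lesssim\; M \cdot e^{-\epsilon^{3} B} \;\le\; \delta
\quad\Longleftarrow\quad
B \;=\; \Omega\!\Big(\frac{\log M}{\epsilon^{3}}\Big),
```

so the requirement on B now scales with log M rather than with m log n.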
Geometry of PIPs with Small Witness Set
• For some PIPs, the size of any witness set is at least Ω(n)
• Idea 2: Consider PIPs whose columns lie on few (K) 1-d subspaces
[Figure: an example instance whose columns lie on K = 2 lines]

Lemma: For such PIPs, we can find a witness set of size O(K^m/ε).
Geometry of PIPs with Small Witness Set
• Covering bound + witness size: it suffices to have B ≥ Ω((m/ε³) log K)
• Final step: convert any PIP into one with K = m^m/ε, losing only an ε fraction of the value

⇒ Algorithm is (1 − ε)-competitive when B = Ω((m²/ε³) log m)
(the arithmetic is spelled out below)
Conclusion
• Guarantee for online PIPs independent of the number of columns n
• Asymptotically matches the guarantee for the single-constraint version [Kleinberg 05]
• Ideas
1) Tailored covering bound based on witnesses
2) Analyze the geometry of the columns to obtain a small witness set
⇒ makes the learning problem more robust
Open problems
1) Obtain the optimal B ≥ Ω((log m)/ε²)? Can do if columns are sampled with replacement [DJSW 11]
2) Generalize to AdWords-type problems
3) Better online models: infinite horizon? less randomness?
Thank you!