Learning and Testing Submodular Functions with Bounded Range

Learning and Testing
Submodular Functions
Grigory Yaroslavtsev
With Sofya Raskhodnikova
(SODA’13)
+ Work in progress
with Rocco Servedio
Columbia University
October 26, 2012
Submodularity
• Discrete analog of convexity/concavity, law of
diminishing returns
• Applications: optimization, algorithmic game theory
Let f: 2^X → [0, R]:
• Discrete derivative:
  ∂_x f(S) = f(S ∪ {x}) − f(S), for S ⊆ X, x ∉ S
• Submodular function:
  ∂_x f(S) ≥ ∂_x f(T), ∀ S ⊆ T ⊆ X, x ∉ T
Exact learning
• Q: Reconstruct a submodular f: 2^X → ℝ with poly(|X|) queries (for all arguments)?
• A: Only a Θ(√|X|)-approximation (multiplicative) is possible [Goemans, Harvey, Iwata, Mirrokni, SODA'09].
• Q: Only for a (1 − ε)-fraction of points (PAC-style learning with membership queries under the uniform distribution)?
  Pr_{randomness of A} [ Pr_{S ∼ U(2^X)} [ A(S) = f(S) ] ≥ 1 − ε ] ≥ 1/2
• A: Almost as hard [Balcan, Harvey, STOC'11].
Approximate learning
• PMAC-learning (multiplicative), with poly(|X|) queries:
  Pr_{randomness of A} [ Pr_{S ∼ U(2^X)} [ f(S) ≤ A(S) ≤ α·f(S) ] ≥ 1 − ε ] ≥ 1/2,
  where Ω(|X|^{1/3}) ≤ α ≤ O(√|X|) (over arbitrary distributions [BH'11])
• PAAC-learning (additive):
  Pr_{randomness of A} [ Pr_{S ∼ U(2^X)} [ |f(S) − A(S)| ≤ β ] ≥ 1 − ε ] ≥ 1/2
  – Running time: |X|^{O((|R|/β)² · log(1/ε))} [Gupta, Hardt, Roth, Ullman, STOC'11]
  – Running time: poly(|X|^{(R/β)²}, log(1/ε)) [Cheraghchi, Klivans, Kothari, Lee, SODA'12]
Learning f: 2^X → [0, R]
• For all algorithms, ε = const.
• Goemans, Harvey, Iwata, Mirrokni: O(√|X|)-approximation everywhere; time poly(|X|)
• Balcan, Harvey: PMAC (multiplicative α = O(√|X|)); time poly(|X|); extra: under arbitrary distribution
• Gupta, Hardt, Roth, Ullman: PAAC (additive β); time |X|^{O(|R|²/β²)}; extra: tolerant queries
• Cheraghchi, Klivans, Kothari, Lee: PAAC (additive β); time |X|^{O(|R|²/β²)}; extra: SQ queries, agnostic
• Our result with Sofya: PAC for f: 2^X → {0, …, R} (bounded integral range R ≤ |X|); time |X|³ · R^{O(R·log R)}; extra: agnostic
Learning: Bigger picture
• Hierarchy of valuation functions:
  Additive (linear) ⊆ OXS ⊆ Gross substitutes ⊆ Submodular ⊆ XOS (= Fractionally subadditive) ⊆ Subadditive
  [Badanidiyuru, Dobzinski, Fu, Kleinberg, Nisan, Roughgarden, SODA'12]
Other positive results:
• Learning valuation functions [Balcan,
Constantin, Iwata, Wang, COLT’12]
• PMAC-learning (sketching) valuation functions
[BDFKNR’12]
• PMAC-learning Lipschitz submodular functions
[BH’10] (concentration around average via
Talagrand)
Discrete convexity
• Monotone convex f: {1, …, n} → {0, …, R}
  [Figure: bar chart of f over 1, 2, 3, …, n; all value changes of f are confined to ≤ R points (annotated "≤ R"); f is constant elsewhere]
• Convex f: {1, …, n} → {0, …, R}
  [Figure: bar chart of f over 1, 2, 3, …, n; value changes are confined to the first ≤ R points and to the last points (indices ≥ n − R); f is constant in between]
Discrete submodularity f: 2^X → {0, …, R}
• Case study R = 1 (Boolean submodular functions f: {0,1}ⁿ → {0,1}):
  – Monotone submodular = x_{i1} ∨ x_{i2} ∨ ⋯ ∨ x_{ia} (monomial)
  – Submodular = (x_{i1} ∨ ⋯ ∨ x_{ia}) ∧ (x_{j1} ∨ ⋯ ∨ x_{jb}) (2-term CNF)
• [Figure: Hasse diagrams of 2^X between ∅ and X; for monotone submodular f the relevant region is |S| ≤ R, for general submodular f the regions |S| ≤ R and |S| ≥ |X| − R]
Discrete monotone submodularity
• Monotone submodular f: 2^X → {0, …, R}
  [Figure: sets S₁, S₂ with values f(S₁), f(S₂) inside the region |S| ≤ R; by monotonicity the value on any superset is ≥ f(S₁) and ≥ f(S₂), hence ≥ max(f(S₁), f(S₂))]
Discrete monotone submodularity
• Theorem: for monotone submodular f: 2^X → {0, …, R} and all T:
  f(T) = max_{S ⊆ T, |S| ≤ R} f(S)
• f(T) ≥ max_{S ⊆ T, |S| ≤ R} f(S) (by monotonicity)
  [Figure: a set T and a subset S ⊆ T with |S| ≤ R]
Discrete monotone submodularity
• f(T) ≤ max_{S ⊆ T, |S| ≤ R} f(S):
  – Let S′ be a smallest subset of T such that f(T) = f(S′).
  – For all x ∈ S′ we have ∂_x f(S′ ∖ {x}) > 0 (otherwise dropping x would give a smaller such subset) ⇒ the restriction of f to 2^{S′} is monotone strictly increasing along chains ⇒ since the range is {0, …, R}, |S′| ≤ R.
  [Figure: S′ ⊆ T with |S′| ≤ R]
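To make the theorem concrete, here is a small brute-force check in Python; the coverage function below is an illustrative choice (coverage functions are monotone submodular), not an example from the slides.

```python
from itertools import combinations

# Toy monotone submodular function with integral range {0, ..., R}:
# the coverage function f(S) = |union of the sets indexed by S|, R = 3.
SETS = {0: {1, 2}, 1: {2, 3}, 2: {3}, 3: {1}}
R = 3
X = list(SETS)

def f(S):
    covered = set()
    for i in S:
        covered |= SETS[i]
    return len(covered)

def max_over_small_subsets(T, R):
    """max of f(S) over S ⊆ T with |S| ≤ R."""
    return max(f(S) for r in range(min(len(T), R) + 1)
               for S in combinations(T, r))

# The theorem: f(T) equals the max over its subsets of size ≤ R.
for r in range(len(X) + 1):
    for T in combinations(X, r):
        assert f(T) == max_over_small_subsets(T, R)
print("verified: f(T) = max over S ⊆ T, |S| ≤ R of f(S)")
```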
Representation by a formula
• Theorem: for monotone submodular f: 2^X → {0, …, R} and all T:
  f(T) = max_{S ⊆ T, |S| ≤ R} f(S)
• Notation switch: X → n, 2^X → (x₁, …, xₙ)
• (Monotone) pseudo-Boolean k-DNF (∨ → max, A_i = 1 → A_i ∈ ℝ):
  max_i [A_i · x_{i1} ∧ x_{i2} ∧ ⋯ ∧ x_{ik}] (no negations)
• (Monotone) submodular f: (x₁, …, xₙ) → {0, …, R} can be represented as a (monotone) pseudo-Boolean 2R-DNF with constants A_i ∈ {0, …, R}.
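A minimal sketch of what a pseudo-Boolean k-DNF computes; the terms and constants below are made up for illustration.

```python
# Evaluating a pseudo-Boolean k-DNF: f(x) = max_i [A_i · (x_{i1} ∧ ... ∧ x_{ik})].
n = 5
# Each term: (constant A_i in {0, ..., R}, tuple of variable indices).
TERMS = [(2, (0, 1)), (3, (1, 2, 3)), (1, (4,))]

def eval_pb_dnf(terms, x):
    """x is a 0/1 tuple; a term contributes A_i iff all its variables are 1."""
    return max((A if all(x[i] for i in idx) else 0) for A, idx in terms)

print(eval_pb_dnf(TERMS, (1, 1, 0, 0, 0)))  # term (2, (0, 1)) fires -> 2
print(eval_pb_dnf(TERMS, (0, 1, 1, 1, 0)))  # term (3, (1, 2, 3)) fires -> 3
print(eval_pb_dnf(TERMS, (0, 0, 0, 0, 0)))  # no term fires -> 0
```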
Discrete submodularity
• Submodular f: (x₁, …, xₙ) → {0, …, R} can be represented as a pseudo-Boolean 2R-DNF with constants A_i ∈ {0, …, R}.
• Hint [Lovász] (submodular monotonization): given submodular f, define
  f_mon(S) = max_{T ⊆ S} f(T).
  Then f_mon is monotone and submodular.
  [Figure: the cube 2^X between ∅ and X with regions |S| ≤ R and |S| ≥ |X| − R]
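A brute-force sketch of the monotonization; the toy non-monotone submodular function below (a concave function of |S|) is an assumption chosen for illustration.

```python
from itertools import combinations

# Lovász monotonization f_mon(S) = max_{T ⊆ S} f(T), computed naively.
X = (0, 1, 2)

def f(S):
    return len(S) * (3 - len(S))       # values 0, 2, 2, 0 for |S| = 0..3

def f_mon(S):
    """Monotone closure of f (exponential-time reference version)."""
    S = tuple(S)
    return max(f(T) for r in range(len(S) + 1) for T in combinations(S, r))

# f is not monotone (f(X) = 0 < f((0,)) = 2); f_mon dominates it.
print(f(X), f_mon(X))                  # 0 2
```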
Learning pB-formulas and k-DNF
• DNF_{k,R} = class of pseudo-Boolean DNF of width k with A_i ∈ {0, …, R}
• The i-slice f_i: (x₁, …, xₙ) → {0,1} is defined by f_i(x₁, …, xₙ) = 1 iff f(x₁, …, xₙ) ≥ i
• If f ∈ DNF_{k,R}, its i-slices f_i are k-DNF and (see the sketch below):
  f(x₁, …, xₙ) = max_{1 ≤ i ≤ R} (i · f_i(x₁, …, xₙ))
• PAC-learning:
  Pr_{rand(A)} [ Pr_{S ∼ U({0,1}ⁿ)} [ A(S) = f(S) ] ≥ 1 − ε ] ≥ 1/2
Learning pB-formulas and k-DNF
• Learn every i-slice f_i on a 1 − ε′ = 1 − ε/R fraction of arguments (combined as in the sketch below)
• Learning k-DNF (DNF_{k,R}); let the Fourier sparsity be S_F = (k log(R/ε))^k:
  – Kushilevitz-Mansour (Goldreich-Levin): poly(n, S_F) queries/time
  – "Attribute-efficient learning": polylog(n) · poly(S_F) queries
  – Lower bound: Ω(2^k) queries to learn a random k-junta (∈ k-DNF) up to constant precision
• Optimizations:
  – Slightly better than KM/GL by looking at the Fourier spectrum of DNF_{k,R} (see SODA paper: switching lemma ⇒ L₁ bound)
  – Do all R iterations of KM/GL in parallel by reusing queries
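A schematic of the slice-by-slice reduction, assuming a hypothetical black-box k-DNF learner `learn_kdnf` (standing in for KM/GL; it is a placeholder parameter, not a real library call).

```python
# PAC-learn f ∈ DNF_{k,R} from a membership oracle by learning each
# Boolean i-slice and combining with max.

def learn_pb_dnf(membership_query, n, k, R, eps, learn_kdnf):
    hypotheses = []
    for i in range(1, R + 1):
        # Membership oracle for the Boolean i-slice of f.
        slice_oracle = lambda x, i=i: 1 if membership_query(x) >= i else 0
        # Learn each slice to error eps/R; a union bound over the R
        # slices bounds the error of the combined hypothesis by eps.
        hypotheses.append((i, learn_kdnf(slice_oracle, n, k, eps / R)))
    # Combine: A(x) = max_i i·h_i(x) (0 if no slice fires).
    return lambda x: max((i * h(x) for i, h in hypotheses), default=0)
```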
Property testing
• Let C be the class of submodular f: {0,1}ⁿ → {0, …, R}.
• How to (approximately) test whether a given f is in C?
• Property tester: (randomized) algorithm for distinguishing:
  1. f ∈ C
  2. (ε-far): min_{g ∈ C} |{x : f(x) ≠ g(x)}| ≥ ε·2ⁿ
• Key idea: k-DNFs have small approximate representations:
  – [Gopalan, Meka, Reingold, CCC'12] (using quasi-sunflowers [Rossman'10]): ∀ε > 0 and every k-DNF formula F there exists a k-DNF formula F′ of size ≤ (k log(1/ε))^{O(k)} such that |{x : F(x) ≠ F′(x)}| ≤ ε·2ⁿ.
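To make the ε-far notion concrete, here is a brute-force computation (illustrative only, exponential in 2ⁿ; real testers never enumerate the class) of the exact distance from a function f to the class C for tiny n and R = 1; the indicator function chosen as f is an arbitrary example.

```python
from itertools import product, combinations

n, R = 3, 1
points = list(product((0, 1), repeat=n))

def is_submodular(g):
    # Equivalent local check on the cube: for x with x_i = x_j = 0,
    # g(x + e_i + e_j) + g(x) ≤ g(x + e_i) + g(x + e_j).
    for x in points:
        for i, j in combinations(range(n), 2):
            if x[i] == 0 and x[j] == 0:
                xi = tuple(1 if t == i else v for t, v in enumerate(x))
                xj = tuple(1 if t == j else v for t, v in enumerate(x))
                xij = tuple(1 if t in (i, j) else v for t, v in enumerate(x))
                if g[xij] + g[x] > g[xi] + g[xj]:
                    return False
    return True

# Enumerate C = all submodular g: {0,1}^n -> {0,...,R}.
C = [dict(zip(points, vals))
     for vals in product(range(R + 1), repeat=2 ** n)]
C = [g for g in C if is_submodular(g)]

f = {x: (1 if sum(x) == 0 else 0) for x in points}  # indicator of 0^n
dist = min(sum(f[x] != g[x] for x in points) for g in C)
print(dist, dist / 2 ** n)  # distance 1, i.e., eps = 0.125
```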
Testing by implicit learning
• Good approximation by juntas ⇒ efficient property testing [surveys: Ron; Servedio]
  – ε-approximation by a J(ε)-junta
  – Good dependence on ε: J(ε) = (k log(1/ε))^{O(k)}
• [Blais, Onak] sunflowers for submodular functions: J(ε) = [O(k log k + log(1/ε))]^{k+1}
  – Query complexity: (k log(1/ε))^{O(k)} (independent of n)
  – Running time: exponential in J(ε) (we think it can be reduced to O(J(ε)))
  – We have an Ω(k) lower bound for testing k-DNF (reduction from Gap Set Intersection: distinguishing a random k-junta vs. a (k + O(log k))-junta requires Ω(k) queries)
Previous work on testing submodularity
f: {0,1}ⁿ → [0, R] [Parnas, Ron, Rubinfeld '03; Seshadhri, Vondrák, ICS'11]:
• Upper bound: (1/ε)^{O(√n)}
• Lower bound: Ω(√n)
  (a gap in query complexity remains)
Special case: coverage functions [Chakrabarty, Huang, ICALP’12].
Thanks!