
Blind online optimization
Gradient descent without a gradient
Abie Flaxman CMU
Adam Tauman Kalai TTI
Brendan McMahan CMU
Standard convex optimization
• Convex feasible set S ⊆ ℝᵈ
• Concave function f : S → ℝ
• Goal: find x with f(x) ≥ max_{z ∈ S} f(z) − ε = f(x*) − ε
• Move in the direction of steepest ascent (sketch below)
• Compute f′(x) (∇f(x) in higher dimensions)
• Works for convex optimization (and many other problems)
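A minimal sketch of the steepest-ascent template these bullets describe; the step size, iteration count, and test function are illustrative assumptions, not from the talk:

```python
import numpy as np

def gradient_ascent(grad_f, x0, eta=0.1, steps=200):
    """Repeatedly step in the direction of steepest ascent,
    i.e. along the gradient of the concave objective f."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + eta * grad_f(x)  # x <- x + eta * grad f(x)
    return x

# Example: f(x) = -||x||^2 is concave with maximizer x* = 0.
x_hat = gradient_ascent(lambda x: -2.0 * x, x0=[3.0, -1.0])
print(x_hat)  # close to [0, 0]
```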
Typical application
• Company produces certain numbers of cars per month
• Vector x ∈ ℝᵈ (#Corollas, #Camrys, …)
• Profit of company is a concave function of the production vector
• Maximize total (equivalently, average) profit
PROBLEMS
Problem definition and results
• Sequence of unknown concave functions f1, f2, … on a convex feasible set S
• Period t: pick xt ∈ S, find out only ft(xt)
Theorem: in this online model, the expected regret after T periods is O(T^{3/4}) (spelled out below)
• Holds for arbitrary sequences
• Stronger than the stochastic model:
  – f1, f2, … i.i.d. from a distribution D
  – x* = arg max_{x ∈ S} E_D[f(x)]
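Spelled out, the guarantee bounds regret against the best fixed point in hindsight; the O(T^{3/4}) rate is the bound from the accompanying paper, with constants depending on d, the diameter of S, and bounds on the ft suppressed:

```latex
\[
\mathrm{regret}(T) \;=\; \max_{x \in S} \sum_{t=1}^{T} f_t(x) \;-\; \sum_{t=1}^{T} f_t(x_t),
\qquad
\mathbb{E}\bigl[\mathrm{regret}(T)\bigr] \;=\; O\!\bigl(T^{3/4}\bigr).
\]
```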
Outline
• Problem definition
• Simple algorithm
• Analysis sketch
• Variations
• Related work & applications
First try
Zinkevich ’03:
[Figure: profit vs. #Camrys — concave curves f1, …, f4, plays x1, …, x4, and the optimum x*]
If we could only compute gradients…
Idea: one point gradient
• With probability ½, estimate = f(x + δ)/δ
• With probability ½, estimate = −f(x − δ)/δ
• E[estimate] = (f(x + δ) − f(x − δ))/2δ ≈ f′(x)
[Figure: profit vs. #Camrys, evaluations at x − δ and x + δ]
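A quick numerical check of the one-point trick; the test function, δ, and sample count are illustrative:

```python
import random

def one_point_estimate(f, x, delta=0.01):
    """One evaluation of f yields an (almost) unbiased slope estimate:
    +f(x+delta)/delta or -f(x-delta)/delta, each with probability 1/2."""
    if random.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta

f = lambda x: -(x - 1.0) ** 2  # concave; f'(0) = 2
n = 200_000
avg = sum(one_point_estimate(f, 0.0) for _ in range(n)) / n
print(avg)  # averages to (f(x+d) - f(x-d)) / 2d = 2, up to sampling noise
```

Note that each individual estimate has magnitude about |f(x)|/δ, so the estimator is unbiased but high variance; controlling that variance is what later costs T^{3/4} rather than √T regret.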
d-dimensional online algorithm
[Figure: iterates x1, …, x4 moving inside the feasible set S]
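A sketch of the full d-dimensional loop, assuming the estimator takes its usual spherical form (d/δ)·ft(xt + δu)·u with u uniform on the unit sphere; the step size, δ, and the projection routine are illustrative, and keeping the played point inside S is exactly the "hidden complication" discussed later:

```python
import numpy as np

def unit_sphere_sample(d, rng):
    """Uniform random direction: a normalized Gaussian vector."""
    u = rng.standard_normal(d)
    return u / np.linalg.norm(u)

def bandit_gradient_ascent(fs, project, x0, eta=0.01, delta=0.1, seed=0):
    """Online gradient ascent when only f_t(played point) is revealed.
    fs: iterable of per-period functions; project: projection onto S."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    rewards = []
    for f in fs:
        u = unit_sphere_sample(d, rng)
        y = x + delta * u              # the one point we play this period
        val = f(y)                     # the only feedback we ever see
        g = (d / delta) * val * u      # one-point gradient estimate
        x = project(x + eta * g)       # ascend, then project back into S
        rewards.append(val)            # (the analysis shrinks S so y stays feasible)
    return x, rewards
```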
Outline
• Problem definition
• Simple algorithm
• Analysis sketch
• Variations
• Related work & applications
Analysis ingredients
• E[1-point estimate] is the gradient of a smoothed version f̂ of f
• |f̂ − f| is small
• Online gradient ascent analysis [Z03]
• Online expected gradient ascent analysis
• (Hidden complications)
1-pt gradient analysis
[Figure: profit vs. #Camrys, evaluation points x − δ and x + δ]
1-pt gradient analysis (d-dim)
• E[1-point estimate] is the gradient of the smoothed function f̂(x) = E_{v ∈ B}[f(x + δv)] (B = unit ball)
• |f̂(x) − f(x)| ≤ δG is small, where G bounds the gradient of f
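The two bullets, written out as identities (B is the unit ball, 𝕊 the unit sphere, G a bound on ‖∇f‖; the gradient identity is an application of the divergence theorem):

```latex
\[
\hat f(x) \;=\; \mathbb{E}_{v \in B}\bigl[f(x + \delta v)\bigr],
\qquad
\nabla \hat f(x) \;=\; \frac{d}{\delta}\,\mathbb{E}_{u \in \mathbb{S}}\bigl[f(x + \delta u)\,u\bigr],
\qquad
\bigl|\hat f(x) - f(x)\bigr| \;\le\; \delta G .
\]
```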
Online gradient ascent [Z03]
• Play xt+1 = PS(xt + η·∇ft(xt)) (projected gradient step)
• Uses the true gradient ∇ft(xt) each period
• Regret ≤ DG√T for step size η = D/(G√T)
(concave, bounded gradient)
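The standard statement of [Z03] behind this slide, with D the diameter of S and G the gradient bound; the fixed step size η = D/(G√T) is one common tuning:

```latex
\[
x_{t+1} = P_S\bigl(x_t + \eta\,\nabla f_t(x_t)\bigr)
\;\Longrightarrow\;
\max_{x \in S}\sum_{t=1}^{T} f_t(x) - \sum_{t=1}^{T} f_t(x_t)
\;\le\; \frac{D^2}{2\eta} + \frac{\eta G^2 T}{2}
\;=\; DG\sqrt{T}.
\]
```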
Expected gradient ascent analysis
• Gradient ascent with unbiased random gradient estimates behaves, in expectation, like regular deterministic gradient ascent on functions gt with ∇gt(xt) = E[estimate | xt]
(concave, bounded gradient)
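Chaining the ingredients shows where the theorem's T^{3/4} comes from; a back-of-envelope version, with C a bound on |ft|: the estimate's norm is at most dC/δ, so [Z03] pays a 1/δ factor, while smoothing costs δGT:

```latex
\[
\mathbb{E}[\mathrm{regret}]
\;\lesssim\;
D\,\frac{dC}{\delta}\sqrt{T} \;+\; \delta G T
\;=\; O\!\bigl(T^{3/4}\bigr)
\quad \text{for } \delta \propto T^{-1/4}.
\]
```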
Hidden complication…
• The sampled point xt + δu can land outside S: step inside a slightly shrunk copy S′ of S
• Thin sets are bad
[Figure: the set S and its shrunk copy S′]
Hidden complication…
• Round sets are good
• …reshape into "isotropic position" [LV03]
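One standard normalization of "isotropic position" (the one used in [LV03]-style algorithms, up to scaling): a uniform random point of S has zero mean and identity covariance, so S cannot be thin in any direction:

```latex
\[
\text{for } x \text{ uniform on } S:\qquad
\mathbb{E}[x] = 0,
\qquad
\mathbb{E}\bigl[x x^{\top}\bigr] = I .
\]
```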
Outline
• Problem definition
• Simple algorithm
• Analysis sketch
• Variations
• Related work & applications
Variations
• Regret bound scales with the diameter D of S and the gradient bound G
• Works against an adaptive adversary
  – chooses ft knowing x1, x2, …, xt-1
• Also works if we only get a noisy estimate of ft(xt), i.e. E[ht(xt)|xt] = ft(xt) (sketch below)
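Since only E[ht(xt)|xt] = ft(xt) is needed, noisy evaluations drop straight into the earlier bandit_gradient_ascent sketch; the profit function, noise level, and box-shaped feasible set here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
true_profit = lambda x: 10.0 - np.sum((x - 2.0) ** 2)            # concave, peak at (2, 2)
noisy_profit = lambda x: true_profit(x) + rng.normal(scale=0.5)  # unbiased noise

project = lambda x: np.clip(x, 0.0, 5.0)    # S = [0, 5]^2, projection = clipping
fs = (noisy_profit for _ in range(20_000))  # same (noisy) profit every period
x_final, rewards = bandit_gradient_ascent(fs, project, x0=np.zeros(2))
print(x_final)  # drifts toward the true optimum (2, 2)
```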
Related convex optimization
• Regular (single f)
  – Sighted (see entire function): Gradient descent, …
  – Blind (evaluations only): Ellipsoid; Random walk [BV02]; Sim. annealing [KV05]; Finite difference
• Stochastic (dist. over f's or dist. over errors)
  – Sighted: Gradient descent (stoch.)
  – Blind: Finite difference; 1-pt. gradient appx. [G89,S97]
• Online (f1, f2, f3, …)
  – Sighted: Gradient descent (online) [Z03]
  – Blind: 1-pt. gradient appx. [BKM04]; Finite difference [Kleinberg04]
Multi-armed bandit (experts)
[Figure: a grid of slot machines with sample payoffs]
[R52,ACFS95,…]
Driving to work (online routing)
• Exponentially many paths… exponentially many slot machines?
• Finite dimensions
• Exploration/exploitation tradeoff [TW02,KV02,AK04,BM04]
Online product design
Conclusions and future work
• Can “learn” to optimize a sequence of unrelated
functions from evaluations
• Answer to:
“What is the sound of one hand clapping?”
• Applications
– Cholesterol
– Paper airplanes
– Advertising
• Future work
– Many players using same algorithm
(game theory)