Blind online optimization:
Gradient descent without a gradient

Abie Flaxman (CMU), Adam Tauman Kalai (TTI), Brendan McMahan (CMU)

Standard convex optimization
• Convex feasible set S ⊆ ℝ^d
• Concave function f : S → ℝ
• Goal: find x with f(x) ≥ max_{z∈S} f(z) − ε = f(x*) − ε
[figure: concave f over S with maximizer x*]

Gradient ascent
• Move in the direction of steepest ascent
• Compute f′(x) (∇f(x) in higher dimensions)
• Works for convex optimization (and many other problems)
[figure: iterates x1, x2, x3, x4 climbing toward the maximum]

Typical application
• Company produces certain numbers of cars per month
• Vector x ∈ ℝ^d (#Corollas, #Camrys, …)
• Profit of the company is a concave function of the production vector
• Maximize total (equivalently, average) profit
• Problems: the profit function is unknown, may change from month to month, and only the profit of the chosen production vector is observed

Problem definition and results
• Sequence of unknown concave functions f1, f2, … on a convex set S
• Period t: pick x_t ∈ S, find out only f_t(x_t)
• Theorem: in the online model, expected regret is
    max_{x∈S} Σ_{t≤T} f_t(x) − E[ Σ_{t≤T} f_t(x_t) ] = O(T^{3/4})
• Holds for arbitrary sequences
• Stronger than the stochastic model:
  – f1, f2, … i.i.d. from a distribution D
  – x* = arg max_{x∈S} E_D[f(x)]

Outline
• Problem definition
• Simple algorithm
• Analysis sketch
• Variations
• Related work & applications

First try
• If we could only compute gradients… online gradient ascent [Zinkevich '03]:
    x_{t+1} = P_S(x_t + η ∇f_t(x_t))
• But in our model, only the single value f_t(x_t) is revealed
[figure: profit vs. #Camrys; curves f1–f4 with plays x1–x4 and optimum x*]

Idea: one-point gradient
• With probability ½, estimate = f(x + δ)/δ
• With probability ½, estimate = −f(x − δ)/δ
• E[estimate] = (f(x + δ) − f(x − δ)) / (2δ) ≈ f′(x)
[figure: profit vs. #Camrys; secant through x − δ and x + δ]

d-dimensional online algorithm
• Pick u_t uniformly at random from the unit sphere
• Play y_t = x_t + δ u_t and observe only f_t(y_t)
• Gradient estimate: g_t = (d/δ) f_t(y_t) u_t
• Update: x_{t+1} = P_S(x_t + η g_t)
• (a runnable sketch of this loop appears at the end of these notes)
[figure: iterates x1–x4 and random probes inside S]

Outline
• Problem definition
• Simple algorithm
• Analysis sketch
• Variations
• Related work & applications

Analysis ingredients
• E[1-point estimate] is the gradient of a smoothed function f̂
• f̂ − f is small
• Online gradient ascent analysis [Z03]
• Online expected gradient ascent analysis
• (Hidden complications)

1-pt gradient analysis (one dimension)
• E[estimate] = (f(x + δ) − f(x − δ)) / (2δ) = f̂′(x), where f̂(x) = (1/(2δ)) ∫_{x−δ}^{x+δ} f(z) dz
[figure: f and its smoothed version f̂ near x − δ, x + δ]

1-pt gradient analysis (d dimensions)
• E[1-point estimate] is the gradient of the smoothed function
    f̂(x) = E_{v∈B}[ f(x + δv) ]   (B = unit ball):
    E_{u∈S^{d−1}}[ (d/δ) f(x + δu) u ] = ∇f̂(x)
  (by Stokes' theorem; d/δ is the surface-to-volume ratio of a δ-ball — derivation and a numerical check appear at the end of these notes)
• f̂ − f is small: |f̂(x) − f(x)| ≤ δL for L-Lipschitz f

Online gradient ascent [Z03]
• x_{t+1} = P_S(x_t + η ∇f_t(x_t))
• Σ_t f_t(x_t) ≥ max_{x∈S} Σ_t f_t(x) − DG√T
  (concave f_t, S of diameter at most D, gradients bounded by G)

Expected gradient ascent analysis
• If E[g_t | x_t] = ∇f̂_t(x_t), ascending along the estimates g_t matches, in expectation, regular deterministic gradient ascent on the f̂_t (concave, bounded gradient)

Hidden complication…
• The probes x_t + δu_t must stay inside S, so the iterates are kept in a slightly shrunken set S′
• Thin sets are bad: probes step outside and the estimate degrades
• Round sets are good
• Fix: reshape S into isotropic position [LV03]
[figures: probes escaping a thin S; shrunken S′; reshaped, rounded S]

Outline
• Problem definition
• Simple algorithm
• Analysis sketch
• Variations
• Related work & applications

Variations
• Regret bound stated in terms of the diameter of S and the bound on ‖∇f_t‖
• Works against an adaptive adversary
  – Chooses f_t knowing x_1, x_2, …, x_{t−1}
• Also works if we only get a noisy estimate of f_t(x_t), i.e. E[h_t(x_t) | x_t] = f_t(x_t)

Related convex optimization

                                 Sighted                            Blind
                                 (see entire function(s))           (evaluations only)
  Regular (single f)             Gradient descent, Ellipsoid, …     Random walk [BV02],
                                                                    Sim. annealing [KV05],
                                                                    Finite difference
  Stochastic (dist. over f's     Gradient descent (stoch.)          Finite difference,
  or dist. over errors)                                             1-pt. gradient appx. [G89,S97]
  Online (f1, f2, f3, …)         Gradient descent (online) [Z03]    1-pt. gradient appx. [FKM04],
                                                                    Finite difference [Kleinberg04],
                                                                    Multi-armed bandit (experts)
                                                                    [R52,ACFS95,…]

Driving to work (online routing)
• Exponentially many paths… exponentially many slot machines?
• No: finitely many dimensions (one per road segment)
• Exploration/exploitation tradeoff [TW02,KV02,AK04,BM04]
[figure: road network from S with per-edge delays]

Online product design
[figure]

Conclusions and future work
• Can "learn" to optimize a sequence of unrelated functions from evaluations alone
• An answer to: "What is the sound of one hand clapping?"
• Applications
  – Cholesterol
  – Paper airplanes
  – Advertising
• Future work
  – Many players using same algorithm (game theory)
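
Appendix: why the one-point estimate is a gradient
A short derivation of the key identity, standard but paraphrased rather than taken verbatim from the slides; B is the unit ball, S^{d−1} the unit sphere:

\[
\hat f(x) \;=\; \mathop{\mathbb{E}}_{v \in B}\big[f(x+\delta v)\big]
\;=\; \frac{1}{\operatorname{vol}(\delta B)} \int_{\delta B} f(x+z)\,dz
\]
\[
\nabla \hat f(x)
\;=\; \frac{1}{\operatorname{vol}(\delta B)} \oint_{\delta S^{d-1}} f(x+z)\,\frac{z}{\|z\|}\,dz
\;=\; \frac{\operatorname{vol}_{d-1}(\delta S^{d-1})}{\operatorname{vol}(\delta B)}
      \mathop{\mathbb{E}}_{u \in S^{d-1}}\big[f(x+\delta u)\,u\big]
\;=\; \frac{d}{\delta}\, \mathop{\mathbb{E}}_{u \in S^{d-1}}\big[f(x+\delta u)\,u\big],
\]

using Stokes' theorem to turn the volume integral into a surface integral, and the fact that the surface-area-to-volume ratio of a ball of radius δ is d/δ.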
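
Appendix: numerical check of the one-point estimate
A minimal sketch, not from the original talk: it assumes Python with NumPy, a randomly generated concave quadratic in place of a real profit function, and illustrative values of d and δ. For a quadratic, f̂ and f differ only by a constant, so ∇f̂ = ∇f exactly and the averaged estimate should match the true gradient up to sampling noise.

import numpy as np

rng = np.random.default_rng(0)
d, delta = 5, 0.5                       # illustrative dimension and probe radius

G = rng.standard_normal((d, d))
A = -(G @ G.T)                          # negative definite => f is concave
b = rng.standard_normal(d)
f = lambda x: 0.5 * x @ A @ x + b @ x   # stand-in "profit" function
grad_f = lambda x: A @ x + b            # true gradient (for comparison only)

def one_point_estimate(x):
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)              # uniform random direction on the sphere
    return (d / delta) * f(x + delta * u) * u

x = rng.standard_normal(d)
est = np.mean([one_point_estimate(x) for _ in range(100_000)], axis=0)
print("true gradient  :", grad_f(x))
print("avg of 1-pt est:", est)          # agrees up to sampling noise

Note that each individual estimate has magnitude on the order of (d/δ)|f(x)|, so it is very noisy and only unbiased; the algorithm absorbs this variance through small step sizes, which is one reason the regret rate is T^{3/4} rather than the √T of the full-gradient setting.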
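
Appendix: the d-dimensional loop end to end
A minimal sketch of the blind online algorithm, again not verbatim from the talk. Assumptions are labeled in the comments: S is taken to be the unit Euclidean ball (so projecting onto the shrunken set is a rescale), η and δ are fixed illustrative constants rather than the T-dependent schedule from the analysis, and a single stationary concave function with a hypothetical optimum `target` stands in for an adversarial sequence.

import numpy as np

rng = np.random.default_rng(1)
d, T = 3, 20_000
eta, delta = 0.002, 0.2           # illustrative; the analysis tunes both in T

def project(x):
    # Projection onto the set of points at least delta inside S = unit ball,
    # so the probed point x + delta*u always stays inside S.
    r = 1.0 - delta
    n = np.linalg.norm(x)
    return x if n <= r else r * x / n

# Stand-in for the unknown sequence f_1, f_2, ...: a fixed concave function,
# maximized at the (hypothetical) production vector `target`.
target = np.array([0.5, -0.3, 0.2])
def f_t(t, y):
    return -np.sum((y - target) ** 2)

x = np.zeros(d)
total = 0.0
for t in range(T):
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)        # uniform random direction on the unit sphere
    y = x + delta * u             # the point actually played in period t
    reward = f_t(t, y)            # the ONLY feedback the algorithm receives
    total += reward
    g = (d / delta) * reward * u  # one-point gradient estimate
    x = project(x + eta * g)      # online (projected) gradient ascent step

print("average profit:", total / T)
print("final x:", x, "vs optimum", target)  # x drifts toward the optimum

The projection radius 1 − δ is the "shrunken set S′" from the hidden-complication slides in miniature; with a general convex body, the reshaping into isotropic position [LV03] plays the role that the ball's symmetry plays here.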