Information Relaxations and Duality in Stochastic Dynamic Programs

In this talk, we discuss the information relaxation approach for obtaining bounds on the performance of
optimal policies in stochastic dynamic programs (DPs). This approach involves relaxing the DP's
information structure and incorporating a penalty that punishes the use of additional information. We
first provide an overview of some basic theory for the general approach. We then discuss how to apply
the method in two broad classes of problems. For DPs with a convex structure, we show how to use
gradient penalties and convex optimization to apply the method, and illustrate with an application in
network revenue management. For infinite-horizon MDPs, we show how to apply the method using
change-of-measure techniques, and illustrate with an application to service allocation in a multiclass
queue with convex delay costs. As we discuss, in both cases the method provides tighter bounds than
other relaxation methods, such as Lagrangian relaxations.
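The core idea can be seen in the simplest instance of the approach, the perfect-information relaxation with a zero penalty: an inner "clairvoyant" optimizer sees the entire sample path, so its expected value upper-bounds the value of any non-anticipative policy. The sketch below (not from the talk; the problem and all names are illustrative) checks this for optimal stopping of a short i.i.d. sequence.

```python
import random

random.seed(0)
T = 3                      # horizon
values = [0.0, 1.0, 2.0]   # support of i.i.d. rewards X_t (uniform)

# Exact DP value for optimally stopping an i.i.d. sequence:
# V_T = E[X], and V_t = E[max(X, V_{t+1})] by backward induction.
mean = sum(values) / len(values)
V = mean
for _ in range(T - 1):
    V = sum(max(x, V) for x in values) / len(values)

# Perfect-information (zero-penalty) relaxation: the clairvoyant stops
# at the pathwise maximum, giving the bound E[max_t X_t] >= V.
# Estimated here by Monte Carlo.
N = 200_000
pi_bound = sum(max(random.choice(values) for _ in range(T))
               for _ in range(N)) / N

assert V <= pi_bound + 1e-3   # the relaxation is a valid upper bound
print(round(V, 3), round(pi_bound, 3))
```

With a well-chosen penalty (e.g., the gradient penalties or change-of-measure constructions discussed in the talk), the gap between the relaxed bound and the true value can be driven much smaller than this zero-penalty gap.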