Lunchseminar in Economics: Self-guided Approximate Linear Programs

Speaker: Selvaprabu Nadarajah, University of Illinois at Chicago, USA
Event: Wednesday, 9 June 2021, 13:00 - 14:00
Location: Online

Abstract

Approximate linear programs (ALPs) are well-known models based on value function approximations (VFAs) to obtain heuristic policies and lower bounds on the optimal policy cost of Markov decision processes (MDPs). The ALP VFA is a linear combination of predefined basis functions that are chosen using domain knowledge and updated heuristically if the ALP optimality gap is large. We side-step the need for such basis function engineering in ALP, an implementation bottleneck, by proposing a sequence of ALPs that embed increasing numbers of random basis functions obtained via inexpensive sampling. We provide a sampling guarantee and show that the VFAs from this sequence of models converge to the exact value function. Nevertheless, the performance of the ALP policy can fluctuate significantly as more basis functions are sampled. To mitigate these fluctuations, we "self-guide" our convergent sequence of ALPs using past VFA information such that a worst-case measure of policy performance is improved. We perform numerical experiments on perishable inventory control and generalized joint replenishment applications, which, respectively, give rise to challenging discounted-cost MDPs and average-cost semi-MDPs. We find that self-guided ALPs (i) significantly reduce policy cost fluctuations and improve on the optimality gaps of an ALP approach that employs basis functions tailored to the former application, and (ii) deliver optimality gaps that are comparable to those of a known adaptive basis function generation approach targeting the latter application. More broadly, our methodology provides application-agnostic policies and lower bounds to benchmark approaches that exploit application structure.
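
For readers unfamiliar with ALPs, the following is a minimal sketch of the textbook formulation for a discounted-cost MDP, written with random basis functions. It is not taken from the talk: the notation (state space S, action sets A_s, cost c, discount factor \gamma, state-relevance distribution \nu, basis function \varphi, sampling distribution \rho) is assumed here for illustration, and the guiding constraint in the last line is only one hypothetical way of using past VFA information.

```latex
% VFA with N sampled basis functions (notation assumed for illustration)
V(s;\beta) = \beta_0 + \sum_{i=1}^{N} \beta_i \,\varphi(s;\theta_i),
\qquad \theta_1,\dots,\theta_N \overset{\text{i.i.d.}}{\sim} \rho

% Standard ALP for a discounted-cost MDP
\max_{\beta} \; \mathbb{E}_{s \sim \nu}\!\left[ V(s;\beta) \right]
\quad \text{s.t.} \quad
V(s;\beta) \le c(s,a) + \gamma \, \mathbb{E}\!\left[ V(s';\beta) \mid s,a \right]
\quad \forall\, s \in S,\; a \in A_s

% Any feasible \beta yields a pointwise lower bound on the optimal cost-to-go V^*
V(s;\beta) \le V^*(s) \quad \forall\, s \in S

% Heuristic (greedy) policy induced by the VFA
\pi(s;\beta) \in \arg\min_{a \in A_s}
\left\{ c(s,a) + \gamma \, \mathbb{E}\!\left[ V(s';\beta) \mid s,a \right] \right\}

% Hypothetical guiding constraint (an assumption, not confirmed by the abstract):
% require the new VFA to dominate the previous one, \beta^{N-1}
V(s;\beta) \ge V(s;\beta^{N-1}) \quad \forall\, s \in S
```

The lower-bound property follows from the monotonicity of the Bellman operator, which is why the ALP objective value can be compared against a simulated policy cost to estimate an optimality gap.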

Selvaprabu Nadarajah is an Assistant Professor of Information and Decision Sciences at the University of Illinois at Chicago (UIC) College of Business. Before joining UIC, he obtained his PhD in Operations Research from Carnegie Mellon University, where he won the Egon Balas best paper award and the William Cooper dissertation award. Selva's research studies (i) energy operations with a focus on firm-level issues faced by users and owners of conversion assets; (ii) the solution of large-scale dynamic optimization problems using reinforcement learning; and (iii) socially responsible and sustainable commerce. His research has received the 2020 INFORMS Early Career Research Publication Award in Energy, Natural Resources, and the Environment and the Best Overall Paper at the 2020 NeurIPS workshop on Tackling Climate Change with Machine Learning. He publishes in top-tier journals such as Management Science, Operations Research, Manufacturing and Service Operations Management, SIAM Journal on Optimization, and the Journal of Machine Learning Research.

File: Selvaprabu Nadarajah 2021.06.09.pdf (330.47 kB)