Dr. Wei Qian (University of Delaware)
Title: An Adaptive Algorithm to Multi-armed Bandit Problem with High-dimensional Covariates
Abstract:
This work studies an important sequential decision-making problem known as the multi-armed bandit problem with covariates. Under a linear bandit framework with high-dimensional covariates, we propose a general arm allocation algorithm that integrates both arm elimination and randomized assignment strategies. By employing a class of high-dimensional regression methods for coefficient estimation, the proposed algorithm is shown to have near-optimal finite-time regret performance under a new study scope that requires neither a margin condition nor a reward gap condition for competitive arms. Building on a synergistically verified benefit of the margin, our algorithm adapts automatically to the margin and gap conditions and attains the optimal regret rates, up to a logarithmic factor, under both study scopes (with or without the margin). The proposed algorithm also generates useful coefficient estimates for competitive arms and is shown to achieve both estimation consistency and variable selection consistency. Promising empirical performance is demonstrated through two real-data evaluation examples in drug dose assignment and news article recommendation.
Speaker:
Dr. Wei Qian is an Assistant Professor of Statistics at the University of Delaware. His research interests include high-dimensional statistics, sequential decision making, actuarial statistics, online recommendation, and data science applications. He is particularly interested in investigating statistical machine learning methods for the analysis of complex and big data stemming from applied problems.