Bandit algorithms tackle the fundamental challenge of balancing exploration (collecting data to learn better models) and exploitation (using current estimates to make decisions). In this talk, I will formalize bandit problems with preference feedback (Is "A" better than "B"?), with structured decision spaces (a large set of correlated arms), and with safety concerns (settings where bad samples are not allowed). These constraints arise in many applications; in particular, we are motivated by online decision-making for clinical treatment and robotic control. I will present several algorithms for these constrained bandit problems, together with their theoretical guarantees and empirical performance. I will also show some of our clinical results on online decision-making for spinal cord injury therapy.
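To make the exploration–exploitation trade-off concrete, here is a minimal sketch of a classic bandit strategy, UCB1 (not necessarily one of the speaker's algorithms): each arm's score combines its empirical mean (exploitation) with a bonus that shrinks as the arm is pulled more often (exploration). The arm means and horizon below are illustrative assumptions.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Play a stochastic Bernoulli bandit with UCB1."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    totals = [0.0] * k    # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # initialization: pull every arm once
        else:
            # empirical mean (exploitation) + confidence bonus (exploration)
            arm = max(range(k), key=lambda a: totals[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts

# Over enough rounds, pulls concentrate on the best arm (mean 0.8),
# while suboptimal arms are still sampled occasionally.
pulls = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

The constrained settings in the talk modify this basic loop: preference feedback replaces numeric rewards with pairwise comparisons, structure lets one pull inform estimates of correlated arms, and safety restricts which arms may be pulled at all.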
Yanan Sui is a postdoctoral researcher in the Computing and Mathematical Sciences department at the California Institute of Technology. He received his PhD from Caltech in Computation and Neural Systems (with a minor in Applied and Computational Mathematics) and his Bachelor's degree from Tsinghua University. His research interests are machine learning, neural engineering, and robotics. He is currently working on theories of online learning with humans in the loop and their applications to neural rehabilitation and robotics.