Catalog Description: Topics will vary from semester to semester. See Computer Science Division announcements.
Units: 1.0-4.0
Formats:
Fall: 2.0-6.0 hours of lecture per week
Spring: 1.0-3.0 hours of lecture per week
Grading basis: letter
Final exam status: No final exam
Department Notes: This course will focus on the design and theoretical analysis of learning methods for sequential decision-making under uncertainty. Sequential decision problems involve a trade-off between exploitation (optimizing performance based on the information at hand) and exploration (gathering more information). These problems arise in many important domains, ranging from clinical trials, through computer network optimization and adaptive packet routing, to website and page content optimization, marketing campaign and internet advertising optimization, and revenue management. Topics covered will include a selection from the following list. Stochastic and game theoretic formulations of sequential decision problems: multi-armed bandits, linear, convex, and Lipschitz bandits, large-scale (combinatorial) bandit problems, contextual bandits, and reinforcement learning/approximate dynamic programming approaches to controlling Markov decision processes; and tools for finite-sample performance analysis. See the course website for more information.