CS 294-102. Learning in Sequential Decision Problems

Catalog Description: Topics will vary from semester to semester. See Computer Science Division announcements.

Units: 1.0-4.0

Formats:
Fall: 2.0-6.0 hours of lecture per week
Spring: 1.0-3.0 hours of lecture per week

Grading basis: letter

Final exam status: No final exam


Class homepage on inst.eecs

General Catalog


Department Notes: This course will focus on the design and theoretical analysis of learning methods for sequential decision-making under uncertainty. Sequential decision problems involve a trade-off between exploitation (optimizing performance based on the information at hand) and exploration (gathering more information). These problems arise in many important domains, ranging from clinical trials, through computer network optimization and adaptive packet routing, to website and page content optimization, marketing campaign and internet advertising optimization, and revenue management. Topics covered will include a selection from the following list. Stochastic and game theoretic formulations of sequential decision problems: multi-armed bandits, linear, convex, and Lipschitz bandits, large-scale (combinatorial) bandit problems, contextual bandits, and reinforcement learning/approximate dynamic programming approaches to controlling Markov decision processes; and tools for finite-sample performance analysis. See the course website for more information.