Faculty Publications - Jiantao Jiao
Masters Reports
- T. Wu, B. Zhu, R. Zhang, Z. Wen, K. Ramchandran, and J. Jiao, "Pairwise Proximal Policy Optimization: Large Language Models Alignment via Comparative RL," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2024-21, April 2024.