Master's Theses & Technical Reports - Jiantao Jiao
M.S.
Pairwise Proximal Policy Optimization: Large Language Models Alignment via Comparative RL
Tianhao Wu [2024]
Comparative Studies on Sample Complexity Bounds in Multi-Agent Reinforcement Learning
Jiaqi Yang [2022]
5th Year M.S.
First Token Probabilities are Unreliable Indicators for LLM Knowledge
Justin Shao [2024]
Theory and Application of Bonus-based Exploration in Reinforcement Learning
Bryan Chen [2021]