Controlling Long-Form Large Language Model Outputs
Kevin Yang
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2024-8
March 4, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-8.pdf
As large language models have greatly increased in capability in recent years, it has become increasingly important to improve our ability to exert control over their outputs. In this thesis, I discuss several such control schemes I have developed, ranging from pure inference-time control to finetuning-based alignment methods. I first discuss highly general methods that apply to unstructured natural language generation, including both an inference-time control scheme called FUDGE and a reinforcement learning-based finetuning approach called RLCD. I next discuss more specialized methods for control in more structured domains such as molecule design, program synthesis, and semantic parsing. Finally, I show how many of these ideas can be combined with structured planning via prompting to extend our control to much longer outputs, in the range of thousands of words, in an automatic story generation application.
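The abstract refers to FUDGE, an inference-time control scheme. As an illustration only, the sketch below shows the general reweighting idea behind this kind of controlled decoding: the base language model's next-token probabilities are rescaled by an attribute classifier's estimate that the completion will satisfy the desired attribute. The toy vocabulary, toy language model, and toy classifier here are hypothetical stand-ins, not components taken from the thesis.

    import numpy as np

    VOCAB = ["the", "story", "was", "happy", "sad", "<eos>"]
    rng = np.random.default_rng(0)

    def base_lm_probs(prefix):
        """Toy base LM: a random distribution over the next token (stand-in for a real model)."""
        logits = rng.normal(size=len(VOCAB))
        return np.exp(logits) / np.exp(logits).sum()

    def attribute_prob(prefix, candidate):
        """Toy attribute classifier: P(attribute satisfied | prefix + candidate token).
        Here the 'attribute' simply favors the token 'happy'."""
        return 0.9 if candidate == "happy" else 0.1

    def controlled_step(prefix):
        p_lm = base_lm_probs(prefix)
        p_attr = np.array([attribute_prob(prefix, tok) for tok in VOCAB])
        # Reweight and renormalize: p(next token | prefix, attribute) is proportional
        # to the base LM probability times the classifier probability.
        p = p_lm * p_attr
        p /= p.sum()
        return VOCAB[rng.choice(len(VOCAB), p=p)]

    prefix = ["the", "story", "was"]
    print(" ".join(prefix + [controlled_step(prefix)]))

Sampling repeatedly from controlled_step would favor continuations the classifier scores highly while still respecting the base model's distribution, which is the basic trade-off this style of inference-time control exploits.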
Advisor: Daniel Klein
BibTeX citation:
@phdthesis{Yang:EECS-2024-8,
    Author = {Yang, Kevin},
    Title = {Controlling Long-Form Large Language Model Outputs},
    School = {EECS Department, University of California, Berkeley},
    Year = {2024},
    Month = {Mar},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-8.html},
    Number = {UCB/EECS-2024-8},
    Abstract = {As large language models have greatly increased in capability in recent years, it becomes increasingly important to improve our ability to exert control over their outputs. In this thesis, I discuss several such control schemes I have developed, ranging from pure inference-time control to finetuning-based alignment methods. I will first discuss highly general methods that apply to unstructured natural language generation, including both an inference-time control scheme called FUDGE as well as a reinforcement-learning based finetuning approach called RLCD. I will next discuss more specialized methods that can be used for control in more structured domains such as molecule design, program synthesis, and semantic parsing. Finally, I will show how many of these ideas can be used in conjunction with structured planning via prompting to extend our control to much longer outputs—in the range of thousands of words—in an automatic story generation application.}
}
EndNote citation:
%0 Thesis
%A Yang, Kevin
%T Controlling Long-Form Large Language Model Outputs
%I EECS Department, University of California, Berkeley
%D 2024
%8 March 4
%@ UCB/EECS-2024-8
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-8.html
%F Yang:EECS-2024-8