Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation (EECS-2024-216)
Danny Halawi, Alexander Wei, Eric Wallace, Tony Wang, Nika Haghtalab and Jacob Steinhardt