Machine Learning Safety
Jiaming Zou
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2022-147
May 19, 2022
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.pdf
In this work, we tackle the general topic of Machine Learning Safety from four different angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure ML agents' understanding of morality and their ability to act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains, and a forecasting benchmark to measure their ability to predict future trends and events.
Advisor: Dawn Song
BibTeX citation:
@mastersthesis{Zou:EECS-2022-147,
    Author = {Zou, Jiaming},
    Title = {Machine Learning Safety},
    School = {EECS Department, University of California, Berkeley},
    Year = {2022},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.html},
    Number = {UCB/EECS-2022-147},
    Abstract = {In this work, we tackle the general topic of Machine Learning Safety from four different angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure ML agents' understanding of morality and their ability to act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains, and a forecasting benchmark to measure their ability to predict future trends and events.}
}
EndNote citation:
%0 Thesis
%A Zou, Jiaming
%T Machine Learning Safety
%I EECS Department, University of California, Berkeley
%D 2022
%8 May 19
%@ UCB/EECS-2022-147
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.html
%F Zou:EECS-2022-147