Machine Learning Safety
Jiaming Zou
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2022-147
May 19, 2022
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.pdf
In this work, we tackle the general topic of Machine Learning Safety from four different angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure ML agents' understanding of morality and their ability to act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains and a forecasting benchmark to measure their ability to predict future trends and events.
Advisor: Dawn Song
BibTeX citation:
@mastersthesis{Zou:EECS-2022-147,
Author= {Zou, Jiaming},
Title= {Machine Learning Safety},
School= {EECS Department, University of California, Berkeley},
Year= {2022},
Month= {May},
Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.html},
Number= {UCB/EECS-2022-147},
Abstract= {In this work, we tackle the general topic of Machine Learning Safety from four different angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure ML agents' understanding of morality and their ability to act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains and a forecasting benchmark to measure their ability to predict future trends and events.},
}
EndNote citation:
%0 Thesis
%A Zou, Jiaming
%T Machine Learning Safety
%I EECS Department, University of California, Berkeley
%D 2022
%8 May 19
%@ UCB/EECS-2022-147
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.html
%F Zou:EECS-2022-147