Machine Learning Safety | EECS at UC Berkeley

Jiaming Zou

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2022-147

May 19, 2022

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.pdf

In this work, we tackle the general topic of Machine Learning Safety from four different angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure ML agents' understanding of morality and their ability to act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains and a forecasting benchmark to measure their ability to predict future trends and events.

Advisor: Dawn Song

BibTeX citation:

@mastersthesis{Zou:EECS-2022-147,
    Author= {Zou, Jiaming},
    Title= {Machine Learning Safety},
    School= {EECS Department, University of California, Berkeley},
    Year= {2022},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.html},
    Number= {UCB/EECS-2022-147},
    Abstract= {In this work, we tackle the general topic of Machine Learning Safety from four different angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure ML agents' understanding of morality and their ability to act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains and a forecasting benchmark to measure their ability to predict future trends and events.},
}

EndNote citation:

%0 Thesis
%A Zou, Jiaming
%T Machine Learning Safety
%I EECS Department, University of California, Berkeley
%D 2022
%8 May 19
%@ UCB/EECS-2022-147
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-147.html
%F Zou:EECS-2022-147