Ashwin Ganesh

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2023-150

May 12, 2023

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-150.pdf

Deep neural networks excel on a variety of tasks, often surpassing human abilities. However, when presented with out-of-distribution data, these models tend to break down even on the simplest tasks. This paper compares the robustness of implicitly-defined and classical deep learning models on a series of mathematical tasks and on a real-world earthquake location prediction task, where the models are tested with out-of-distribution samples at inference time. Across all experiments, implicit models greatly outperform classical deep learning networks, which overfit the training distribution. This paper then shows how to reduce implicit model training time by harnessing the state-driven implicit modeling framework to safely eliminate features while maintaining model accuracy. Safe feature elimination is demonstrated on the FashionMNIST dataset and on earthquake location prediction, offering a promising avenue for the broader adoption of implicit models.
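For context, an implicit model replaces a stack of feedforward layers with an equilibrium condition on a state vector. Below is a minimal sketch of the prediction rule, assuming the standard implicit deep learning formulation x = φ(Ax + Bu) with readout ŷ = Cx + Du; the random matrices, ReLU activation, and norm rescaling are illustrative assumptions, not the report's trained weights:

    import numpy as np

    def implicit_predict(A, B, C, D, u, tol=1e-6, max_iter=500):
        """Evaluate an implicit model: solve the fixed-point equation
        x = phi(A x + B u) by plain Picard iteration, then read out
        y = C x + D u. Convergence is guaranteed here because A is
        rescaled so its infinity-norm is below 1 (well-posedness)."""
        phi = lambda z: np.maximum(z, 0.0)  # ReLU, 1-Lipschitz
        x = np.zeros(A.shape[0])
        for _ in range(max_iter):
            x_next = phi(A @ x + B @ u)
            if np.linalg.norm(x_next - x, np.inf) < tol:
                return C @ x_next + D @ u
            x = x_next
        return C @ x + D @ u

    # Tiny illustrative example with a random well-posed model.
    rng = np.random.default_rng(0)
    n, p, q = 8, 4, 2                       # state, input, output sizes
    A = rng.standard_normal((n, n))
    A *= 0.9 / np.abs(A).sum(axis=1).max()  # enforce ||A||_inf < 1
    B = rng.standard_normal((n, p))
    C = rng.standard_normal((q, n))
    D = rng.standard_normal((q, p))
    print(implicit_predict(A, B, C, D, rng.standard_normal(p)))

Unlike a fixed-depth network, the prediction is defined by the equilibrium itself, which is what the state-driven implicit modeling framework exploits when eliminating features.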

Advisors: Laurent El Ghaoui and Alper Atamtürk


BibTeX citation:

@mastersthesis{Ganesh:EECS-2023-150,
    Author= {Ganesh, Ashwin},
    Editor= {El Ghaoui, Laurent and Atamtürk, Alper},
    Title= {Towards Efficient and Robust Out-of-Distribution Deep Learning with Implicit Models},
    School= {EECS Department, University of California, Berkeley},
    Year= {2023},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-150.html},
    Number= {UCB/EECS-2023-150},
    Abstract= {Deep neural networks excel on a variety of tasks, often surpassing human abilities. However, when presented with out-of-distribution data, these models tend to break down even on the simplest tasks. This paper compares the robustness of implicitly-defined and classical deep learning models on a series of mathematical tasks and on a real-world earthquake location prediction task, where the models are tested with out-of-distribution samples at inference time. Across all experiments, implicit models greatly outperform classical deep learning networks, which overfit the training distribution. This paper then shows how to reduce implicit model training time by harnessing the state-driven implicit modeling framework to safely eliminate features while maintaining model accuracy. Safe feature elimination is demonstrated on the FashionMNIST dataset and on earthquake location prediction, offering a promising avenue for the broader adoption of implicit models.},
}

EndNote citation:

%0 Thesis
%A Ganesh, Ashwin 
%E El Ghaoui, Laurent 
%E Atamtürk, Alper 
%T Towards Efficient and Robust Out-of-Distribution Deep Learning with Implicit Models
%I EECS Department, University of California, Berkeley
%D 2023
%8 May 12
%@ UCB/EECS-2023-150
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-150.html
%F Ganesh:EECS-2023-150