Alicia Tsai
EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2024-214
December 14, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-214.pdf
This thesis investigates the transformative potential of implicit models in deep learning, with a focus on their capabilities to tackle challenges in extrapolation, sparsity, and robustness. Unlike traditional neural networks that rely on predefined, layer-by-layer architectures, implicit models define outputs through equilibrium equations, enabling dynamic adaptability and compact representations. We present a comprehensive framework for understanding implicit models, encompassing theoretical foundations of well-posedness, algorithms for constrained sparsification, and robustness analyses. The versatility and effectiveness of implicit models are demonstrated across diverse tasks, including mathematical operations, temporal forecasting, and geographical extrapolation, where they consistently outperform non-implicit baselines, particularly under distribution shifts. Key contributions include the introduction of the sensitivity matrix and error bounds, which provide interpretable robustness measurements and facilitate the generation of adversarial attacks. We also highlight depth adaptability and closed-loop feedback as fundamental mechanisms driving the superior extrapolation performance of implicit models. This work establishes implicit models as a robust, scalable, and interpretable alternative paradigm for neural network design, with significant implications for addressing real-world challenges involving noisy, sparse, and out-of-distribution data.
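The equilibrium idea can be sketched concretely. The following is a minimal illustration of the standard implicit-model prediction rule x = φ(Ax + Bu), y = Cx + Du (the formulation popularized by El Ghaoui et al.'s implicit deep learning), solved here by plain fixed-point iteration; it is an assumption-laden sketch, not code from the thesis. With a ReLU activation, scaling A so that its infinity norm is below 1 is a sufficient well-posedness condition, guaranteeing a unique equilibrium.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def implicit_forward(A, B, C, D, u, tol=1e-8, max_iter=500):
    """Solve the equilibrium equation x = relu(A x + B u) by
    fixed-point iteration, then read out y = C x + D u."""
    x = np.zeros(A.shape[0])
    for _ in range(max_iter):
        x_next = relu(A @ x + B @ u)
        if np.linalg.norm(x_next - x, np.inf) < tol:
            x = x_next
            break
        x = x_next
    return C @ x + D @ u, x

rng = np.random.default_rng(0)
n, p, q = 8, 3, 2          # state, input, and output dimensions (illustrative)
A = rng.standard_normal((n, n))
A *= 0.9 / np.linalg.norm(A, np.inf)   # ||A||_inf < 1: well-posed for ReLU
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))
D = rng.standard_normal((q, p))
u = rng.standard_normal(p)

y, x_star = implicit_forward(A, B, C, D, u)
```

Unlike a layered network, depth is not fixed in advance: the iteration runs until the state x converges, which is the "depth adaptability" the abstract refers to.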
Advisor: Laurent El Ghaoui
BibTeX citation:
@phdthesis{Tsai:EECS-2024-214,
    Author = {Tsai, Alicia},
    Title = {Implicit Learning in Deep Models: Enhancing Extrapolation Power and Sparsity},
    School = {EECS Department, University of California, Berkeley},
    Year = {2024},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-214.html},
    Number = {UCB/EECS-2024-214},
    Abstract = {This thesis investigates the transformative potential of implicit models in deep learning, with a focus on their capabilities to tackle challenges in extrapolation, sparsity, and robustness. Unlike traditional neural networks that rely on predefined, layer-by-layer architectures, implicit models define outputs through equilibrium equations, enabling dynamic adaptability and compact representations. We present a comprehensive framework for understanding implicit models, encompassing theoretical foundations of well-posedness, algorithms for constrained sparsification, and robustness analyses. The versatility and effectiveness of implicit models are demonstrated across diverse tasks, including mathematical operations, temporal forecasting, and geographical extrapolation, where they consistently outperform non-implicit baselines, particularly under distribution shifts. Key contributions include the introduction of the sensitivity matrix and error bounds, which provide interpretable robustness measurements and facilitate the generation of adversarial attacks. We also highlight depth adaptability and closed-loop feedback as fundamental mechanisms driving the superior extrapolation performance of implicit models. This work establishes implicit models as a robust, scalable, and interpretable alternative paradigm for neural network design, with significant implications for addressing real-world challenges involving noisy, sparse, and out-of-distribution data.}
}
EndNote citation:
%0 Thesis %A Tsai, Alicia %T Implicit Learning in Deep Models: Enhancing Extrapolation Power and Sparsity %I EECS Department, University of California, Berkeley %D 2024 %8 December 14 %@ UCB/EECS-2024-214 %U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-214.html %F Tsai:EECS-2024-214