Fangda Gu

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2021-249

December 1, 2021

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-249.pdf

Deep implicit models are a recent development in deep learning. Traditionally, deep learning methods rely on explicit feedforward structures, and very deep architectures have been proposed to improve performance in various domains. Such approaches are difficult to analyze theoretically and underperform shallower models in some domains. By introducing into the forward pass a recursive structure defined by the solution of an equilibrium equation, implicit deep models capture the idea of infinitely deep neural networks while keeping the model representation simple, which enables theoretical analysis and closer connections to prior work in the mathematics and control communities. Recent work on implicit models has demonstrated state-of-the-art empirical performance.
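
In the implicit deep learning formulation of El Ghaoui et al., the prediction rule is y = Cx + Du, where the state x solves the equilibrium equation x = phi(Ax + Bu). As a minimal sketch (not the dissertation's code; the ReLU activation, matrix shapes, tolerance, and iteration cap are illustrative assumptions), the equilibrium can be computed by fixed-point iteration:

import numpy as np

def implicit_forward(A, B, C, D, u, tol=1e-6, max_iter=500):
    # Prediction rule of an implicit model: y = C x + D u, where the
    # state x solves the equilibrium equation x = phi(A x + B u).
    # phi is taken to be the componentwise ReLU in this sketch.
    phi = lambda z: np.maximum(z, 0.0)
    x = np.zeros(A.shape[0])              # arbitrary starting state
    for _ in range(max_iter):
        x_new = phi(A @ x + B @ u)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return C @ x_new + D @ u      # converged to the equilibrium
        x = x_new
    return C @ x + D @ u                  # best iterate if not converged

When phi is componentwise nonexpansive (as ReLU is) and the infinity norm of A is below one, this iteration is a contraction in the infinity norm, so it converges to the unique equilibrium regardless of the starting state.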

Despite their promise, implicit models are very new and face both theoretical and empirical challenges. On the theoretical side, efficient and effective training and evaluation of implicit models remain open problems: naive training methods are highly inefficient, naive initialization easily violates the well-posedness conditions of implicit models, and the robustness of implicit models is not well studied. On the empirical side, only a limited number of works apply implicit models to real-world problems, and these works have not demonstrated a significant performance gain over conventional deep learning. Applications of implicit models in most areas remain largely unexplored, even though implicit models fit easily into the deep learning framework.
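
To make the initialization issue concrete: a standard sufficient condition for well-posedness with a componentwise nonexpansive activation is that the infinity norm of A be below one, and a randomly initialized A can easily violate it. Below is a minimal sketch of one simple remedy, rescaling A at initialization (the margin kappa is an illustrative choice, not a value from the dissertation):

import numpy as np

def rescale_for_wellposedness(A, kappa=0.95):
    # Enforce the sufficient well-posedness condition ||A||_inf < 1
    # by shrinking A whenever its infinity norm (maximum absolute
    # row sum) reaches or exceeds the margin kappa.
    norm_A = np.linalg.norm(A, np.inf)
    return A if norm_A < kappa else (kappa / norm_A) * A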

In this dissertation, we present our theoretical and empirical contributions to deep implicit models. The dissertation is split into two parts. The first part focuses on the theoretical foundations of deep implicit models, covering evaluation, training, and related topics for implicit deep learning and for deep learning in general. The second part explores applications of deep implicit models and the corresponding theory to real-world machine learning problems. We show that implicit models can outperform existing deep learning techniques on a range of tasks, thanks to their implicit structure, which resembles an infinitely deep neural network.

Advisor: Laurent El Ghaoui


BibTeX citation:

@phdthesis{Gu:EECS-2021-249,
    Author= {Gu, Fangda},
    Title= {Implicit Models: Theories and Applications},
    School= {EECS Department, University of California, Berkeley},
    Year= {2021},
    Month= {Dec},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-249.html},
    Number= {UCB/EECS-2021-249},
    Abstract= {Deep implicit models are a recent development in deep learning. Traditionally, deep learning methods rely on explicit feedforward structures, and very deep architectures have been proposed to improve performance in various domains. Such approaches are difficult to analyze theoretically and underperform shallower models in some domains. By introducing into the forward pass a recursive structure defined by the solution of an equilibrium equation, implicit deep models capture the idea of infinitely deep neural networks while keeping the model representation simple, which enables theoretical analysis and closer connections to prior work in the mathematics and control communities. Recent work on implicit models has demonstrated state-of-the-art empirical performance.

Despite their promise, implicit models are very new and face both theoretical and empirical challenges. On the theoretical side, efficient and effective training and evaluation of implicit models remain open problems: naive training methods are highly inefficient, naive initialization easily violates the well-posedness conditions of implicit models, and the robustness of implicit models is not well studied. On the empirical side, only a limited number of works apply implicit models to real-world problems, and these works have not demonstrated a significant performance gain over conventional deep learning. Applications of implicit models in most areas remain largely unexplored, even though implicit models fit easily into the deep learning framework.

In this dissertation, we present our theoretical and empirical contributions to deep implicit models. The dissertation is split into two parts. The first part focuses on the theoretical foundations of deep implicit models, covering evaluation, training, and related topics for implicit deep learning and for deep learning in general. The second part explores applications of deep implicit models and the corresponding theory to real-world machine learning problems. We show that implicit models can outperform existing deep learning techniques on a range of tasks, thanks to their implicit structure, which resembles an infinitely deep neural network.},
}

EndNote citation:

%0 Thesis
%A Gu, Fangda 
%T Implicit Models: Theories and Applications
%I EECS Department, University of California, Berkeley
%D 2021
%8 December 1
%@ UCB/EECS-2021-249
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-249.html
%F Gu:EECS-2021-249