Lifted Neural Networks
Armin Askari and Geoff Negiar and Rajiv Sambharya and Laurent El Ghaoui
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2021-218
October 8, 2021
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-218.pdf
We describe a novel family of models for multilayer feedforward neural networks in which the activation functions are encoded via penalties in the training problem. Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem. The new framework allows algorithms such as block-coordinate descent to be applied, in which each step is a simple (no hidden layer) supervised learning problem that is parallelizable across data points and/or layers. Experiments indicate that the proposed models provide excellent initial guesses for the weights of standard neural networks. In addition, the model opens avenues for interesting extensions, such as robustness against noisy inputs and optimizing over the parameters of activation functions.
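To make the argmin representation concrete, here is one standard instance consistent with the abstract (the choice of ReLU and the notation below are our illustration, not a summary of the report itself): the ReLU activation is the Euclidean projection of its input onto the nonnegative orthant,

    \max(0, x) \;=\; \operatorname*{argmin}_{z \ge 0} \; \|z - x\|_2^2 .

Replacing each activation with the corresponding penalty makes the training objective convex in each block of variables (a weight matrix, or one layer's activations) when the other blocks are held fixed, which is what makes block-coordinate updates tractable. The sketch below, assuming a single hidden layer, quadratic penalties, and a least-squares loss, illustrates the resulting alternating scheme; the function name and the solve-then-project Z-update are our simplifications, not code from the report.

    # Minimal sketch of lifted training for one hidden layer:
    #   minimize ||Y - W2 Z||^2 + lam ||Z - W1 X||^2  subject to  Z >= 0,
    # where X is d x n (inputs), Y is k x n (targets), Z is hidden x n.
    import numpy as np

    def lifted_train(X, Y, hidden, lam=1.0, iters=50, seed=0):
        rng = np.random.default_rng(seed)
        W1 = 0.1 * rng.standard_normal((hidden, X.shape[0]))
        Z = np.maximum(0.0, W1 @ X)              # lifted layer variables
        W2 = Y @ np.linalg.pinv(Z)
        for _ in range(iters):
            # Weight updates: two independent least-squares problems,
            # each a simple (no hidden layer) supervised learning step.
            W1 = (Z @ X.T) @ np.linalg.pinv(X @ X.T)
            W2 = (Y @ Z.T) @ np.linalg.pinv(Z @ Z.T)
            # Activation update: ridge-type solve, then projection onto
            # z >= 0 (a crude surrogate for the exact constrained step);
            # it decouples across columns of X, i.e., across data points.
            A = W2.T @ W2 + lam * np.eye(hidden)
            Z = np.maximum(0.0, np.linalg.solve(A, W2.T @ Y + lam * (W1 @ X)))
        return W1, W2

The two weight updates are ordinary least-squares fits, and the Z-update is parallelizable across data points, matching the structure described in the abstract; the resulting W1, W2 can then serve as an initial guess for training a standard network.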
BibTeX citation:
@techreport{Askari:EECS-2021-218,
    Author = {Askari, Armin and Negiar, Geoff and Sambharya, Rajiv and El Ghaoui, Laurent},
    Title = {Lifted Neural Networks},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2021},
    Month = {Oct},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-218.html},
    Number = {UCB/EECS-2021-218},
    Abstract = {We describe a novel family of models for multilayer feedforward neural networks in which the activation functions are encoded via penalties in the training problem. Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem. The new framework allows algorithms such as block-coordinate descent to be applied, in which each step is a simple (no hidden layer) supervised learning problem that is parallelizable across data points and/or layers. Experiments indicate that the proposed models provide excellent initial guesses for the weights of standard neural networks. In addition, the model opens avenues for interesting extensions, such as robustness against noisy inputs and optimizing over the parameters of activation functions.}
}
EndNote citation:
%0 Report
%A Askari, Armin
%A Negiar, Geoff
%A Sambharya, Rajiv
%A El Ghaoui, Laurent
%T Lifted Neural Networks
%I EECS Department, University of California, Berkeley
%D 2021
%8 October 8
%@ UCB/EECS-2021-218
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-218.html
%F Askari:EECS-2021-218