Lifted Neural Networks
Armin Askari and Geoff Negiar and Rajiv Sambharya and Laurent El Ghaoui
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2021-218
October 8, 2021
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-218.pdf
We describe a novel family of models for multilayer feedforward neural networks in which the activation functions are encoded via penalties in the training problem. Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem. The new framework allows algorithms such as block-coordinate descent to be applied, in which each step is a simple (no hidden layer) supervised learning problem that is parallelizable across data points and/or layers. Experiments indicate that the proposed models provide excellent initial guesses for the weights of standard neural networks. In addition, the model opens avenues for interesting extensions, such as robustness against noisy inputs and optimizing over the parameters of activation functions.
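To make the argmin representation concrete, here is one standard instance consistent with the abstract (the choice of ReLU and the notation below are our illustration, not a summary of the report itself): the ReLU activation is the Euclidean projection of its input onto the nonnegative orthant,

    \max(0, x) \;=\; \operatorname*{argmin}_{z \ge 0} \; \|z - x\|_2^2 .

Replacing each activation with the corresponding penalty makes the training objective convex in each block of variables (a weight matrix, or one layer's activations) when the other blocks are held fixed, which is what makes block-coordinate updates tractable. The sketch below, assuming a single hidden layer, quadratic penalties, and a least-squares loss, illustrates the resulting alternating scheme; the function name and the solve-then-project Z-update are our simplifications, not code from the report.

    # Minimal sketch of lifted training for one hidden layer:
    #   minimize ||Y - W2 Z||^2 + lam ||Z - W1 X||^2  subject to  Z >= 0,
    # where X is d x n (inputs), Y is k x n (targets), Z is hidden x n.
    import numpy as np

    def lifted_train(X, Y, hidden, lam=1.0, iters=50, seed=0):
        rng = np.random.default_rng(seed)
        W1 = 0.1 * rng.standard_normal((hidden, X.shape[0]))
        Z = np.maximum(0.0, W1 @ X)              # lifted layer variables
        W2 = Y @ np.linalg.pinv(Z)
        for _ in range(iters):
            # Weight updates: two independent least-squares problems,
            # each a simple (no hidden layer) supervised learning step.
            W1 = (Z @ X.T) @ np.linalg.pinv(X @ X.T)
            W2 = (Y @ Z.T) @ np.linalg.pinv(Z @ Z.T)
            # Activation update: ridge-type solve, then projection onto
            # z >= 0 (a crude surrogate for the exact constrained step);
            # it decouples across columns of X, i.e., across data points.
            A = W2.T @ W2 + lam * np.eye(hidden)
            Z = np.maximum(0.0, np.linalg.solve(A, W2.T @ Y + lam * (W1 @ X)))
        return W1, W2

The two weight updates are ordinary least-squares fits, and the Z-update is parallelizable across data points, matching the structure described in the abstract; the resulting W1, W2 can then serve as an initial guess for training a standard network.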
BibTeX citation:
@techreport{Askari:EECS-2021-218,
    Author = {Askari, Armin and Negiar, Geoff and Sambharya, Rajiv and El Ghaoui, Laurent},
    Title = {Lifted Neural Networks},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2021},
    Month = {Oct},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-218.html},
    Number = {UCB/EECS-2021-218},
    Abstract = {We describe a novel family of models for multilayer feedforward neural networks in which the activation functions are encoded via penalties in the training problem. Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem. The new framework allows algorithms such as block-coordinate descent to be applied, in which each step is a simple (no hidden layer) supervised learning problem that is parallelizable across data points and/or layers. Experiments indicate that the proposed models provide excellent initial guesses for the weights of standard neural networks. In addition, the model opens avenues for interesting extensions, such as robustness against noisy inputs and optimizing over the parameters of activation functions.}
}
EndNote citation:
%0 Report
%A Askari, Armin
%A Negiar, Geoff
%A Sambharya, Rajiv
%A El Ghaoui, Laurent
%T Lifted Neural Networks
%I EECS Department, University of California, Berkeley
%D 2021
%8 October 8
%@ UCB/EECS-2021-218
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-218.html
%F Askari:EECS-2021-218