Sayna Ebrahimi

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2020-82

May 28, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-82.pdf

Artificial neural networks have exceeded human-level performance on several individual tasks (e.g., voice recognition, object recognition, and video games). However, such success remains modest compared to human intelligence, which can learn and perform an unlimited number of tasks. Humans' ability to learn and accumulate knowledge over their lifetime is an essential aspect of their intelligence. In this respect, continual machine learning aims at a higher level of machine intelligence by providing artificial agents with the ability to learn online from a nonstationary and never-ending stream of data. A key component of such a never-ending learning process is overcoming the catastrophic forgetting of previously seen data, a problem that neural networks are well known to suffer from. The work described in this thesis investigates continual learning and solutions to mitigate the forgetting phenomenon in two common frameworks: Bayesian and non-Bayesian neural networks. We assume a task-incremental setting where tasks arrive one at a time with distinct boundaries.

First, we build an evolving system whose capacity dynamically increases to accommodate new tasks without compromising scalability. We do so by developing knowledge shared among tasks while learning features unique to each one. To further prevent forgetting, we use a small episodic memory containing a few samples from old tasks. We demonstrate this approach on non-Bayesian neural networks, without loss of generality or applicability to Bayesian neural networks.
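As a rough illustration only (the architecture, module sizes, and memory policy below are assumptions, not the thesis's exact design), the following PyTorch sketch shows the kind of model this describes: a shared backbone, a small private module and head added per task, and a replay loss over a few stored samples from old tasks.

import random
import torch
import torch.nn as nn

class SharedPrivateModel(nn.Module):
    """Shared backbone plus a dynamically grown private module and head per task."""
    def __init__(self, in_dim=784, hidden=256, private_dim=64, n_classes=10):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())  # knowledge shared across tasks
        self.private = nn.ModuleList()  # one small module per task, added on demand
        self.heads = nn.ModuleList()    # one classifier head per task
        self.in_dim, self.hidden, self.private_dim, self.n_classes = in_dim, hidden, private_dim, n_classes

    def add_task(self):
        # Grow capacity for a new task without altering existing private modules.
        self.private.append(nn.Sequential(nn.Linear(self.in_dim, self.private_dim), nn.ReLU()))
        self.heads.append(nn.Linear(self.hidden + self.private_dim, self.n_classes))
        return len(self.private) - 1  # id of the new task

    def forward(self, x, task_id):
        feat = torch.cat([self.shared(x), self.private[task_id](x)], dim=1)
        return self.heads[task_id](feat)

class EpisodicMemory:
    """Stores a few (x, y) samples per task and replays them during later tasks."""
    def __init__(self, per_task=20):
        self.per_task, self.buffer = per_task, {}

    def store(self, task_id, xs, ys):
        self.buffer[task_id] = list(zip(xs[: self.per_task], ys[: self.per_task]))

    def sample(self, k=16):
        items = [(t, x, y) for t, data in self.buffer.items() for x, y in data]
        return random.sample(items, min(k, len(items)))

def train_step(model, memory, optimizer, x, y, task_id, loss_fn=nn.CrossEntropyLoss()):
    # Loss on the current task's batch plus a replay loss on samples from old tasks.
    loss = loss_fn(model(x, task_id), y)
    for t, mx, my in memory.sample():
        loss = loss + loss_fn(model(mx.unsqueeze(0), t), my.unsqueeze(0))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In this sketch the replay term is what limits forgetting, while the per-task private modules keep features learned for a new task from overwriting the shared representation.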

Second, specific to Bayesian neural networks, as an alternative to dynamically growing the architecture and storing old data (which may be infeasible, if not impossible, due to confidentiality issues), important parameters in a model can be identified and future changes to them regularized. We consider a fixed network capacity where each parameter is a distribution rather than a single real-valued number. We leverage the per-parameter uncertainty defined in the Bayesian setting to guide the continual learning process in determining which parameters are important: the more certain a parameter is, the less we want it to change in favor of learning new tasks. We present a simple yet effective regularization technique in which the learning rate of each parameter is scaled inversely with its uncertainty.
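As a minimal sketch of this idea only (the mean-field parameterization and update rule below are illustrative assumptions, not the thesis's exact method), each weight can be modeled as a Gaussian and the step applied to its mean scaled by its standard deviation, so confidently learned parameters barely move when a new task arrives.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Linear layer with a factorized Gaussian posterior N(mu, sigma^2) per weight."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.w_mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))  # sigma = softplus(rho)
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))

    def forward(self, x):
        # Reparameterization trick: sample weights from the current posterior.
        w = self.w_mu + F.softplus(self.w_rho) * torch.randn_like(self.w_mu)
        b = self.b_mu + F.softplus(self.b_rho) * torch.randn_like(self.b_mu)
        return F.linear(x, w, b)

def uncertainty_scaled_step(layer, base_lr=0.01):
    """SGD-like update where each mean's step size is proportional to its posterior
    standard deviation: certain (low-sigma) parameters are nearly frozen.
    (The variances themselves are kept fixed in this toy sketch.)"""
    with torch.no_grad():
        for mu, rho in [(layer.w_mu, layer.w_rho), (layer.b_mu, layer.b_rho)]:
            if mu.grad is None:
                continue
            per_param_lr = base_lr * F.softplus(rho)  # uncertainty-scaled learning rate
            mu -= per_param_lr * mu.grad

# Toy usage: one uncertainty-guided step on a random batch.
layer = BayesianLinear(4, 2)
x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
F.cross_entropy(layer(x), y).backward()
uncertainty_scaled_step(layer)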

The methods proposed in this thesis tackle important aspects of continual learning. They are evaluated on multiple benchmarks and over various learning sequences, advancing the state of the art in continual learning and critically identifying challenges in bringing continual learning into practical application.

Advisor: Trevor Darrell


BibTeX citation:

@mastersthesis{Ebrahimi:EECS-2020-82,
    Author= {Ebrahimi, Sayna},
    Title= {Continual Learning with Neural Networks},
    School= {EECS Department, University of California, Berkeley},
    Year= {2020},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-82.html},
    Number= {UCB/EECS-2020-82},
    Abstract= {Artificial neural networks have exceeded human-level performance on several individual tasks (e.g., voice recognition, object recognition, and video games). However, such success remains modest compared to human intelligence, which can learn and perform an unlimited number of tasks. Humans' ability to learn and accumulate knowledge over their lifetime is an essential aspect of their intelligence. In this respect, continual machine learning aims at a higher level of machine intelligence by providing artificial agents with the ability to learn online from a nonstationary and never-ending stream of data. A key component of such a never-ending learning process is overcoming the catastrophic forgetting of previously seen data, a problem that neural networks are well known to suffer from. The work described in this thesis investigates continual learning and solutions to mitigate the forgetting phenomenon in two common frameworks: Bayesian and non-Bayesian neural networks. We assume a task-incremental setting where tasks arrive one at a time with distinct boundaries.

First, we build an evolving system whose capacity dynamically increases to accommodate new tasks without compromising scalability. We do so by developing knowledge shared among tasks while learning features unique to each one. To further prevent forgetting, we use a small episodic memory containing a few samples from old tasks. We demonstrate this approach on non-Bayesian neural networks, without loss of generality or applicability to Bayesian neural networks.

Second, specific to Bayesian neural networks, as an alternative to dynamically growing the architecture and storing old data (which may be infeasible, if not impossible, due to confidentiality issues), important parameters in a model can be identified and future changes to them regularized. We consider a fixed network capacity where each parameter is a distribution rather than a single real-valued number. We leverage the per-parameter uncertainty defined in the Bayesian setting to guide the continual learning process in determining which parameters are important: the more certain a parameter is, the less we want it to change in favor of learning new tasks. We present a simple yet effective regularization technique in which the learning rate of each parameter is scaled inversely with its uncertainty.

The methods proposed in this thesis tackle important aspects of continual learning. They are evaluated on multiple benchmarks and over various learning sequences, advancing the state of the art in continual learning and critically identifying challenges in bringing continual learning into practical application.},
}

EndNote citation:

%0 Thesis
%A Ebrahimi, Sayna 
%T Continual Learning with Neural Networks
%I EECS Department, University of California, Berkeley
%D 2020
%8 May 28
%@ UCB/EECS-2020-82
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-82.html
%F Ebrahimi:EECS-2020-82