NDSGD: A Practical Method to Improve Robustness of Deep Learning Model on Noisy Dataset

Zhi Chen

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2020-55
May 22, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-55.pdf

Noisy labels pose new challenges for deep learning. Many datasets are freely available for download on the web, but they tend to contain inaccurate labels. Training on these datasets degrades performance because deep neural networks can easily memorize label noise. Existing solutions either construct noise-elimination algorithms to filter out noisy labels or propose noise-robust algorithms that learn directly from noisy labels, but these designs do not account well for the intrinsic mechanisms and scalability of deep neural networks. This paper proposes a novel approach called Noisy Dataset Stochastic Gradient Descent (NDSGD), which optimizes each step of stochastic gradient descent (i.e., clipping noisy data, carrying out grouping, and adding robustness factors) to improve the robustness of deep learning models. Experimental results on MNIST, NEWS, and CIFAR-10 demonstrate that NDSGD is superior on noisy datasets and can make deep learning models robust in noisy environments while maintaining high accuracy.
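The report body (not reproduced on this page) defines the three NDSGD steps precisely. As a rough illustration only, the sketch below interprets them in a way common to robust SGD variants: "noisy data clipping" as gradient-norm clipping, "carry out groups" as averaging clipped gradients over small groups of examples, and "robustness factors" as a damping coefficient on the update. All names here (ndsgd_step, clip_norm, group_size, robustness_factor) and the exact formulas are assumptions for illustration, not the author's implementation.

import torch

def ndsgd_step(model, loss_fn, xs, ys, lr=0.1,
               clip_norm=1.0, group_size=4, robustness_factor=0.9):
    # One hypothetical NDSGD update on a mini-batch (xs, ys).
    params = [p for p in model.parameters() if p.requires_grad]
    grad_sum = [torch.zeros_like(p) for p in params]  # accumulates clipped group grads
    n_groups = 0
    for start in range(0, len(xs), group_size):
        # Step 2 ("carry out groups"): process the batch group by group.
        x_g, y_g = xs[start:start + group_size], ys[start:start + group_size]
        model.zero_grad()
        loss_fn(model(x_g), y_g).backward()
        # Step 1 ("noisy data clipping", read here as gradient clipping):
        # bound the group's gradient norm so a few mislabeled examples
        # cannot dominate the update.
        total_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params)).item()
        scale = min(1.0, clip_norm / (total_norm + 1e-12))
        for s, p in zip(grad_sum, params):
            s.add_(p.grad, alpha=scale)
        n_groups += 1
    # Step 3 ("robustness factors", read here as update damping).
    with torch.no_grad():
        for s, p in zip(grad_sum, params):
            p.sub_(lr * robustness_factor * s / n_groups)

Under this reading, clipping bounds the influence any small set of mislabeled examples can exert on a single update, grouping averages out residual noise, and the damping factor slows memorization of whatever noise survives; whether this matches the report's actual formulation can only be checked against the full text at the URL above.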

Advisor: Dawn Song


BibTeX citation:

@mastersthesis{Chen:EECS-2020-55,
    Author = {Chen, Zhi},
    Title = {NDSGD: A Practical Method to Improve Robustness of Deep Learning Model on Noisy Dataset},
    School = {EECS Department, University of California, Berkeley},
    Year = {2020},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-55.html},
    Number = {UCB/EECS-2020-55},
    Abstract = {Noisy labels pose new challenges for deep learning. Many datasets are freely available for download on the web, but they tend to contain inaccurate labels. Training on these datasets degrades performance because deep neural networks can easily memorize label noise. Existing solutions either construct noise-elimination algorithms to filter out noisy labels or propose noise-robust algorithms that learn directly from noisy labels, but these designs do not account well for the intrinsic mechanisms and scalability of deep neural networks.
This paper proposes a novel approach called Noisy Dataset Stochastic Gradient Descent (NDSGD), which optimizes each step of stochastic gradient descent (i.e., clipping noisy data, carrying out grouping, and adding robustness factors) to improve the robustness of deep learning models. Experimental results on MNIST, NEWS, and CIFAR-10 demonstrate that NDSGD is superior on noisy datasets and can make deep learning models robust in noisy environments while maintaining high accuracy.}
}

EndNote citation:

%0 Thesis
%A Chen, Zhi
%T NDSGD: A Practical Method to Improve Robustness of Deep Learning Model on Noisy Dataset
%I EECS Department, University of California, Berkeley
%D 2020
%8 May 22
%@ UCB/EECS-2020-55
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-55.html
%F Chen:EECS-2020-55