Communication-Efficient Federated Learning with Sketching

Ashwinee Panda, Daniel Rothchild, Enayat Ullah, and Nikita Ivkin

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2020-57
May 23, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-57.pdf

Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FedSketchedSGD, to overcome these challenges. FedSketchedSGD compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers. A key insight in the design of FedSketchedSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates. We prove that FedSketchedSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model.
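The abstract's central observation is that the Count Sketch is a linear map, so sketches from many clients can be merged by simple addition, and momentum can be applied to the sketch itself on the aggregator. A minimal Python illustration of that property follows; the class and variable names are illustrative assumptions, not the authors' implementation.

    import numpy as np

    class CountSketch:
        """A minimal Count Sketch; all parameter choices are illustrative."""
        def __init__(self, rows, cols, dim, seed=0):
            # Sketches can only be merged if they share hash functions,
            # hence the fixed default seed.
            rng = np.random.default_rng(seed)
            self.buckets = rng.integers(0, cols, size=(rows, dim))  # bucket hash per row
            self.signs = rng.choice([-1.0, 1.0], size=(rows, dim))  # sign hash per row
            self.table = np.zeros((rows, cols))

        def accumulate(self, vec):
            # Sketching is linear: sketch(a + b) == sketch(a) + sketch(b).
            for r in range(self.table.shape[0]):
                np.add.at(self.table[r], self.buckets[r], self.signs[r] * vec)

        def merge(self, other):
            # Mergeability: adding tables equals sketching the sum of the inputs.
            self.table += other.table

        def estimate(self, i):
            # Median across rows of the per-row unbiased estimates of coordinate i.
            rows = self.table.shape[0]
            return np.median([self.signs[r, i] * self.table[r, self.buckets[r, i]]
                              for r in range(rows)])

    # Two clients sketch their local gradients; the server merges the sketches
    # and keeps a momentum accumulator entirely in sketch space.
    dim, rows, cols, rho = 1_000, 5, 200, 0.9
    g1, g2 = np.random.randn(dim), np.random.randn(dim)

    s1 = CountSketch(rows, cols, dim); s1.accumulate(g1)
    s2 = CountSketch(rows, cols, dim); s2.accumulate(g2)
    s1.merge(s2)  # identical (up to float error) to sketching g1 + g2 directly

    momentum = CountSketch(rows, cols, dim)
    momentum.table = rho * momentum.table + s1.table  # momentum inside the sketch

    print(momentum.estimate(0), (g1 + g2)[0])  # noisy estimate vs. true coordinate

Because momentum (and, by the same argument, accumulated error) lives on the server as a sketch, a client that participates only once never needs to maintain local state, which is the property the abstract credits for tolerating sparse client participation.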

Advisors: Raluca Ada Popa and Joseph Gonzalez


BibTeX citation:

@mastersthesis{Panda:EECS-2020-57,
    Author = {Panda, Ashwinee and Rothchild, Daniel and Ullah, Enayat and Ivkin, Nikita},
    Editor = {Gonzalez, Joseph and Arora, Raman and Braverman, Vladimir and Stoica, Ion},
    Title = {Communication-Efficient Federated Learning with Sketching},
    School = {EECS Department, University of California, Berkeley},
    Year = {2020},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-57.html},
    Number = {UCB/EECS-2020-57},
    Abstract = {Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FedSketchedSGD, to overcome these challenges. FedSketchedSGD compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers. A key insight in the design of FedSketchedSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates. We prove that FedSketchedSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model.}
}

EndNote citation:

%0 Thesis
%A Panda, Ashwinee
%A Rothchild, Daniel
%A Ullah, Enayat
%A Ivkin, Nikita
%E Gonzalez, Joseph
%E Arora, Raman
%E Braverman, Vladimir
%E Stoica, Ion
%T Communication-Efficient Federated Learning with Sketching
%I EECS Department, University of California, Berkeley
%D 2020
%8 May 23
%@ UCB/EECS-2020-57
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-57.html
%F Panda:EECS-2020-57