The Cost of Parallelizing Boosting
Xin Lyu and Hongxun Wu and Junzhao Yang
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2024-207
December 4, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-207.pdf
We study the cost of parallelizing weak-to-strong boosting algorithms for learning, following the recent work of Karbasi and Larsen. Our main results are twofold:

- First, we prove a tight lower bound, showing that even "slight" parallelization of boosting requires an exponential blow-up in the complexity of training. Specifically, let γ be the weak learner's advantage over random guessing. The famous AdaBoost algorithm produces an accurate hypothesis by interacting with the weak learner for Õ(1/γ^2) rounds, where each round runs in polynomial time. Karbasi and Larsen showed that "significant" parallelization must incur an exponential blow-up: any boosting algorithm either interacts with the weak learner for Ω(1/γ) rounds or incurs an exp(d/γ) blow-up in the complexity of training, where d is the VC dimension of the hypothesis class. We close the gap by showing that any boosting algorithm either has Ω(1/γ^2) rounds of interaction or incurs a smaller exponential blow-up of exp(d).

- Complementing our lower bound, we show that there exists a boosting algorithm using Õ(1/(tγ^2)) rounds while suffering only a blow-up of exp(d⋅t^2). Plugging in t = ω(1), this shows that the smaller blow-up in our lower bound is tight. More interestingly, this provides the first trade-off between the parallelism and the total work required for boosting.
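For reference, the "rounds of interaction" counted above are the sequential calls that a booster makes to the weak learner. The report's constructions are not reproduced here; the following is only a minimal Python sketch of the classical, fully sequential AdaBoost loop. The helper weak_learn, the ±1 label convention, and the round budget T (roughly O(log n / γ^2) for a γ-advantage weak learner) are illustrative assumptions, not part of the report.

import math

def adaboost(X, y, weak_learn, T):
    """Sketch of sequential AdaBoost.
    X: list of examples; y: list of labels in {-1, +1};
    weak_learn(X, y, weights) -> hypothesis h with h(x) in {-1, +1} (assumed helper);
    T: number of rounds of interaction with the weak learner."""
    n = len(X)
    D = [1.0 / n] * n          # distribution over training examples
    ensemble = []              # list of (alpha_t, h_t) pairs

    for _ in range(T):         # each iteration is one round of interaction
        h = weak_learn(X, y, D)
        # weighted error of the weak hypothesis under the current distribution
        eps = sum(D[i] for i in range(n) if h(X[i]) != y[i])
        eps = min(max(eps, 1e-12), 1 - 1e-12)   # clamp to avoid division by zero
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, h))

        # reweight: boost the weight of misclassified examples, then renormalize
        D = [D[i] * math.exp(-alpha * y[i] * h(X[i])) for i in range(n)]
        Z = sum(D)
        D = [w / Z for w in D]

    def strong_hypothesis(x):
        # weighted majority vote of the collected weak hypotheses
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return strong_hypothesis

Because each call to weak_learn depends on the distribution produced by the previous round, this loop is inherently sequential; the results above quantify what must be paid, in total training work, to compress these Õ(1/γ^2) adaptive rounds.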
Advisors: Jelani Nelson and Avishay Tal
BibTeX citation:
@mastersthesis{Lyu:EECS-2024-207,
    Author = {Lyu, Xin and Wu, Hongxun and Yang, Junzhao},
    Title = {The Cost of Parallelizing Boosting},
    School = {EECS Department, University of California, Berkeley},
    Year = {2024},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-207.html},
    Number = {UCB/EECS-2024-207}
}
EndNote citation:
%0 Thesis
%A Lyu, Xin
%A Wu, Hongxun
%A Yang, Junzhao
%T The Cost of Parallelizing Boosting
%I EECS Department, University of California, Berkeley
%D 2024
%8 December 4
%@ UCB/EECS-2024-207
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-207.html
%F Lyu:EECS-2024-207