The Cost of Parallelizing Boosting
Xin Lyu and Hongxun Wu and Junzhao Yang
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2024-207
December 4, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-207.pdf
We study the cost of parallelizing weak-to-strong boosting algorithms for learning, following the recent work of Karbasi and Larsen. Our main results are twofold:

- First, we prove a tight lower bound, showing that even "slight" parallelization of boosting requires an exponential blow-up in the complexity of training. Specifically, let γ be the weak learner's advantage over random guessing. The famous AdaBoost algorithm produces an accurate hypothesis by interacting with the weak learner for Õ(1/γ^2) rounds, where each round runs in polynomial time. Karbasi and Larsen showed that "significant" parallelization must incur an exponential blow-up: any boosting algorithm either interacts with the weak learner for Ω(1/γ) rounds or incurs an exp(d/γ) blow-up in the complexity of training, where d is the VC dimension of the hypothesis class. We close the gap by showing that any boosting algorithm either has Ω(1/γ^2) rounds of interaction or incurs a smaller exponential blow-up of exp(d).

- Complementing our lower bound, we show that there exists a boosting algorithm using Õ(1/(tγ^2)) rounds while suffering only a blow-up of exp(d⋅t^2). Plugging in t = ω(1), this shows that the smaller blow-up in our lower bound is tight. More interestingly, this provides the first trade-off between the parallelism and the total work required for boosting.
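For reference, the "rounds of interaction" counted above are the sequential calls that a booster makes to the weak learner. The report's constructions are not reproduced here; the following is only a minimal Python sketch of the classical, fully sequential AdaBoost loop. The helper weak_learn, the ±1 label convention, and the round budget T (roughly O(log n / γ^2) for a γ-advantage weak learner) are illustrative assumptions, not part of the report.

import math

def adaboost(X, y, weak_learn, T):
    """Sketch of sequential AdaBoost.
    X: list of examples; y: list of labels in {-1, +1};
    weak_learn(X, y, weights) -> hypothesis h with h(x) in {-1, +1} (assumed helper);
    T: number of rounds of interaction with the weak learner."""
    n = len(X)
    D = [1.0 / n] * n          # distribution over training examples
    ensemble = []              # list of (alpha_t, h_t) pairs

    for _ in range(T):         # each iteration is one round of interaction
        h = weak_learn(X, y, D)
        # weighted error of the weak hypothesis under the current distribution
        eps = sum(D[i] for i in range(n) if h(X[i]) != y[i])
        eps = min(max(eps, 1e-12), 1 - 1e-12)   # clamp to avoid division by zero
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, h))

        # reweight: boost the weight of misclassified examples, then renormalize
        D = [D[i] * math.exp(-alpha * y[i] * h(X[i])) for i in range(n)]
        Z = sum(D)
        D = [w / Z for w in D]

    def strong_hypothesis(x):
        # weighted majority vote of the collected weak hypotheses
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return strong_hypothesis

Because each call to weak_learn depends on the distribution produced by the previous round, this loop is inherently sequential; the results above quantify what must be paid, in total training work, to compress these Õ(1/γ^2) adaptive rounds.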
Advisors: Jelani Nelson and Avishay Tal
BibTeX citation:
@mastersthesis{Lyu:EECS-2024-207,
    Author = {Lyu, Xin and Wu, Hongxun and Yang, Junzhao},
    Title = {The Cost of Parallelizing Boosting},
    School = {EECS Department, University of California, Berkeley},
    Year = {2024},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-207.html},
    Number = {UCB/EECS-2024-207}
}
EndNote citation:
%0 Thesis
%A Lyu, Xin
%A Wu, Hongxun
%A Yang, Junzhao
%T The Cost of Parallelizing Boosting
%I EECS Department, University of California, Berkeley
%D 2024
%8 December 4
%@ UCB/EECS-2024-207
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-207.html
%F Lyu:EECS-2024-207