Accelerating Randomized Numerical Linear Algebra on a Short-Vector Machine
Jingyi Xu and Borivoje Nikolic and Sophia Shao
EECS Department, University of California, Berkeley
Technical Report No. UCB/
December 1, 2025
In recent years, the field of Randomized Numerical Linear Algebra (RNLA) has matured and attracted attention for its potential to speed up not only classical numerical linear algebra problems but also a wide range of applications, including machine learning, statistics, and big data processing. Software libraries such as RandBLAS and RandLAPACK have been developed to implement randomized algorithms efficiently. Although randomized algorithms alone already deliver orders-of-magnitude speedups over classical linear algebra algorithms, combining them with hardware acceleration creates further opportunities to improve performance and energy efficiency. The first step toward hardware acceleration of randomized algorithms is a detailed workload characterization that identifies the speedup potential of the individual kernels in these algorithms.
In this project, we accelerate sketch-and-solve and sketch-and-precondition algorithms on a RISC-V Vector machine. Our manually optimized vector implementation achieves up to a2 60× speedup over its CPU counterpart, with performance gains becoming more pronounced as matrix dimensions increase.
Advisors: Borivoje Nikolic and Sophia Shao
BibTeX citation:
@mastersthesis{Xu:31816,
    Author = {Xu, Jingyi and Nikolic, Borivoje and Shao, Sophia},
    Title = {Accelerating Randomized Numerical Linear Algebra on a Short-Vector Machine},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Number = {UCB/},
    Abstract = {In recent years, the field of Randomized Numerical Linear Algebra (RNLA) has matured and attracted attention for its potential to speed up not only classical numerical linear algebra problems but also a wide range of applications, including machine learning, statistics, and big data processing. Software libraries such as RandBLAS and RandLAPACK have been developed to implement randomized algorithms efficiently. Although randomized algorithms alone already deliver orders-of-magnitude speedups over classical linear algebra algorithms, combining them with hardware acceleration creates further opportunities to improve performance and energy efficiency. The first step toward hardware acceleration of randomized algorithms is a detailed workload characterization that identifies the speedup potential of the individual kernels in these algorithms. In this project, we accelerate sketch-and-solve and sketch-and-precondition algorithms on a RISC-V Vector machine. Our manually optimized vector implementation achieves up to a2 60× speedup over its CPU counterpart, with performance gains becoming more pronounced as matrix dimensions increase.},
}
EndNote citation:
%0 Thesis
%A Xu, Jingyi
%A Nikolic, Borivoje
%A Shao, Sophia
%T Accelerating Randomized Numerical Linear Algebra on a Short-Vector Machine
%I EECS Department, University of California, Berkeley
%D 2025
%8 December 1
%@ UCB/
%F Xu:31816