Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software

Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michal Derezinski, Miles E. Lopes, Tianyu Liang, Hengrui Luo and Jack Dongarra

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2023-19
February 22, 2023

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-19.pdf

Randomized numerical linear algebra – RandNLA, for short – concerns the use of randomization as a resource to develop improved algorithms for large-scale linear algebra computations. The origins of contemporary RandNLA lay in theoretical computer science, where it blossomed from a simple idea: randomization provides an avenue for computing approximate solutions to linear algebra problems more efficiently than deterministic algorithms. This idea proved fruitful in, and was largely driven by, the development of scalable algorithms for machine learning and statistical data analysis applications. However, the true potential of RandNLA only came into focus once it began to integrate with the fields of numerical analysis and “classical” numerical linear algebra. Through the efforts of many individuals, randomized algorithms have been developed that provide full control over the accuracy of their solutions and that can be every bit as reliable as algorithms that might be found in libraries such as LAPACK.
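To make that “simple idea” concrete, the following is a minimal sketch-and-solve illustration in Python/NumPy. It is our own illustrative example (the problem sizes, the Gaussian sketching operator, and the sketch size d = 8n are arbitrary choices), not an excerpt from the monograph or from any RandBLAS/RandLAPACK implementation: a tall least-squares problem is compressed by a random sketching operator, and a deterministic solver is applied to the much smaller sketched problem.

    import numpy as np

    rng = np.random.default_rng(0)

    # A tall least-squares problem: minimize ||A x - b||_2 over x.
    m, n = 20_000, 50
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    # "Sketch-and-solve": compress the problem with a Gaussian sketching
    # operator S having d << m rows, then solve the small sketched problem
    # with an ordinary deterministic solver.
    d = 8 * n                                  # sketch size: a modest multiple of n
    S = rng.standard_normal((d, m)) / np.sqrt(d)

    x_sk = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

    # The sketched solution is only approximate, but its residual is
    # typically within a small factor of the optimal residual.
    print(np.linalg.norm(A @ x_sk - b) / np.linalg.norm(A @ x_ls - b))

The gain, of course, is that the deterministic solve acts on a d-by-n problem rather than an m-by-n one; the quality of the approximation is governed by the choice of sketching operator and sketch size.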

The spectrum of possibilities offered by RandNLA has created a virtuous cycle of contributions by numerical analysts, statisticians, theoretical computer scientists, and the machine learning community. Recent years have even seen the incorporation of certain RandNLA methods into MATLAB, the NAG Library, and NVIDIA’s cuSOLVER. In view of these developments, we believe the time is ripe to accelerate the adoption of RandNLA in the scientific community. In particular, we believe the community stands to benefit significantly from a suitably defined “RandBLAS” and “RandLAPACK,” to serve as standard libraries for RandNLA, in much the same way that BLAS and LAPACK serve as standards for deterministic linear algebra.

This monograph surveys the field of RandNLA as a step toward building meaningful RandBLAS and RandLAPACK libraries. Section 1 primes the reader for a dive into the field and summarizes this monograph’s content at multiple levels of detail. Section 2 focuses on RandBLAS, which is to be responsible for sketching. Details of functionality suitable for RandLAPACK are covered in the five sections that follow. Specifically, Sections 3 to 5 cover least squares and optimization, low-rank approximation, and a selection of other problems for which the benefits of randomized algorithms are well understood. The remaining sections – on statistical leverage scores (Section 6) and tensor computations (Section 7) – read more like traditional surveys. The different flavor of these latter sections reflects how, in our assessment, the literature on these topics is still maturing. We provide a substantial amount of pseudo-code and supplementary material over the course of five appendices. Much of the pseudo-code has been tested via publicly available MATLAB and Python implementations.
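As a small taste of the low-rank approximation material covered in the middle sections, here is a bare-bones randomized SVD built on a Gaussian rangefinder, again in Python/NumPy. It is an illustrative sketch under generic assumptions (a Gaussian test matrix, a fixed target rank k, and a small oversampling parameter), not pseudo-code taken from the appendices.

    import numpy as np

    def rand_svd(A, k, oversample=10, seed=None):
        """Basic randomized SVD via a Gaussian rangefinder.

        Returns U, s, Vt such that U @ np.diag(s) @ Vt approximates A
        when A is (numerically) close to rank k.
        """
        rng = np.random.default_rng(seed)
        m, n = A.shape
        # Sketch the range of A with a random test matrix.
        Omega = rng.standard_normal((n, k + oversample))
        Q, _ = np.linalg.qr(A @ Omega)      # orthonormal basis for the sketched range
        # Project A onto that basis and factor the small matrix exactly.
        B = Q.T @ A                         # (k + oversample) x n
        Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
        return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

    # Example: recover a rank-20 matrix.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((2000, 20)) @ rng.standard_normal((20, 500))
    U, s, Vt = rand_svd(A, k=20, seed=2)
    print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))

All of the expensive work here happens in matrix-matrix products and a QR/SVD of small matrices, which is precisely the kind of workload a RandBLAS (sketching) plus RandLAPACK (higher-level drivers) split is meant to organize.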

Note: this EECS Technical Report replaces EECS-2022-258.


BibTeX citation:

@techreport{Murray:EECS-2023-19,
    Author = {Murray, Riley and Demmel, James and Mahoney, Michael W. and Erichson, N. Benjamin and Melnichenko, Maksim and Malik, Osman Asif and Grigori, Laura and Luszczek, Piotr and Derezinski, Michal and Lopes, Miles E. and Liang, Tianyu and Luo, Hengrui and Dongarra, Jack},
    Title = {Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2023},
    Month = {Feb},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-19.html},
    Number = {UCB/EECS-2023-19},
    Abstract = {Randomized numerical linear algebra – RandNLA, for short – concerns the use of randomization as a resource to develop improved algorithms for large-scale linear algebra computations. The origins of contemporary RandNLA lay in theoretical computer science, where it blossomed from a simple idea: randomization provides an avenue for computing approximate solutions to linear algebra problems more efficiently than deterministic algorithms. This idea proved fruitful in and was largely driven by the development of scalable algorithms for machine learning and statistical data analysis applications. However, the true potential of RandNLA only came into focus once it began to integrate with the fields of numerical analysis and “classical” numerical linear algebra. Through the efforts of many individuals, randomized algorithms have been developed that provide full control over the accuracy of their solutions and that can be every bit as reliable as algorithms that might be found in libraries such as LAPACK.

The spectrum of possibilities offered by RandNLA has created a virtuous cycle of contributions by numerical analysts, statisticians, theoretical computer scientists, and the machine learning community. Recent years have even seen the incorporation of certain RandNLA methods into MATLAB, the NAG Library, and NVIDIA’s cuSOLVER. In view of these developments, we believe the time is ripe to accelerate the adoption of RandNLA in the scientific community. In particular, we believe the community stands to benefit significantly from a suitably defined “RandBLAS” and “RandLAPACK,” to serve as standard libraries for RandNLA, in much the same way that BLAS and LAPACK serve as standards for deterministic linear algebra.

This monograph surveys the field of RandNLA as a step toward building meaningful RandBLAS and RandLAPACK libraries. Section 1 primes the reader for a dive into the field and summarizes this monograph’s content at multiple levels of detail. Section 2 focuses on RandBLAS, which is to be responsible for sketching. Details of functionality suitable for RandLAPACK are covered in the five sections that follow. Specifically, Sections 3 to 5 cover least squares and optimization, low-rank approximation, and other select problems that are well-understood in how they benefit from randomized algorithms. The remaining sections – on statistical leverage scores (Section 6) and tensor computations (Section 7) – read more like traditional surveys. The different flavor of these latter sections reflects how, in our assessment, the literature on these topics is still maturing.
We provide a substantial amount of pseudo-code and supplementary material over the course of five appendices. Much of the pseudo-code has been tested via publicly available Matlab and Python implementations.

Note: this EECS Technical Report replaces EECS-2022-258.}
}

EndNote citation:

%0 Report
%A Murray, Riley
%A Demmel, James
%A Mahoney, Michael W.
%A Erichson, N. Benjamin
%A Melnichenko, Maksim
%A Malik, Osman Asif
%A Grigori, Laura
%A Luszczek, Piotr
%A Derezinski, Michal
%A Lopes, Miles E.
%A Liang, Tianyu
%A Luo, Hengrui
%A Dongarra, Jack
%T Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software
%I EECS Department, University of California, Berkeley
%D 2023
%8 February 22
%@ UCB/EECS-2023-19
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-19.html
%F Murray:EECS-2023-19