KDD'24 Tutorial on RandNLA for ML

Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning

Organizers: Michał Dereziński (University of Michigan) and Michael W. Mahoney (ICSI, LBNL and UC Berkeley)

Large matrices arise in many machine learning and data analysis applications, including as representations of datasets, graphs, model weights, and first- and second-order derivatives. Randomized Numerical Linear Algebra (RandNLA) is an area which uses randomness to develop improved algorithms for ubiquitous matrix problems. The area has reached a certain level of maturity; but recent hardware trends, efforts to incorporate RandNLA algorithms into core numerical libraries, and advances in machine learning, statistics, and random matrix theory have led to new theoretical and practical challenges. This tutorial provides a self-contained overview of RandNLA in light of these developments.

Where: KDD'24, Barcelona, Spain
When: Sunday, August 25, 2024, 10am–1pm CEST

Materials: survey paper [1] and slides

Previous edition: NeurIPS'23


Overview

Matrices provide a natural structure with which to model data. For example, an m x n matrix A can encode information about m objects, each of which is described by n features. Alternatively, an n x n positive definite matrix A can encode correlations/similarities between all pairs of n objects. Motivated by large-scale data problems, recent years have witnessed many exciting developments in the theory and practice of matrix algorithms. Particularly remarkable is the use of randomization. Historically, in statistics, machine learning (ML), and the domain sciences, randomization has been treated as a property of the input data, e.g., arising from noise in the data generation mechanism. In this more recent work, randomization is instead used as an algorithmic or computational resource.

Randomized Numerical Linear Algebra (RandNLA) is an interdisciplinary research area that exploits randomization as a computational resource to develop improved algorithms for large-scale linear algebra problems. From a foundational perspective, it has roots in theoretical computer science (TCS), deep connections with convex analysis, probability theory, and metric embedding theory, and strong ties to scientific computing, signal processing, and numerical linear algebra (NLA). From an implementational perspective, well-engineered RandNLA algorithms beat highly-optimized software libraries for ubiquitous problems such as very over-determined least-squares, they scale well to parallel/distributed environments, and they beat the state of the art for a wide range of low-rank matrix approximation problems. From a data analysis perspective, RandNLA has strong connections with ML and statistics, as well as many "non-methodological" applications of data analysis. More generally, it remains important as interest grows in providing an algorithmic and statistical foundation for modern large-scale data analysis.
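As a concrete illustration of the sketch-and-solve paradigm for very over-determined least-squares, the following NumPy sketch compresses a tall problem with a dense Gaussian sketching matrix and solves the much smaller sketched problem. The problem sizes, the sketch size s, and the choice of a Gaussian sketch are illustrative assumptions, not prescriptions from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tall least-squares problem: m >> n (sizes chosen for illustration).
m, n = 20000, 50
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)

# Gaussian sketching matrix S of size s x m, with s a small multiple of n.
s = 500
S = rng.standard_normal((s, m)) / np.sqrt(s)

# Solve the small s x n sketched problem instead of the full m x n one.
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# The sketched residual is within a small factor of the optimal residual.
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
```

In practice, the dense Gaussian sketch is typically replaced by structured or sparse sketching operators, which reduce the cost of forming S @ A; the dense version above is only the simplest instance of the idea.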

The area of RandNLA has achieved a certain level of maturity. As such, there are multiple reviews of the area from multiple different perspectives: introductory overviews (light on prerequisites) [1-3]; broad and proof-heavy resources [4-6]; perspectives on interdisciplinary theory (light on proofs) [7,8]; deep investigations of specific disciplinary topics [9-12]; and approaches to high-quality software implementations [13]. Particularly notable is the ongoing effort to incorporate RandNLA algorithms into the core numerical libraries (e.g., RandLAPACK and RandBLAS; see [13]) that lie at the foundation of virtually all computational tools in ML (as well as in scientific computing and beyond).

This level of maturity, together with recent demands from the ML community and recent trends in hardware, leads to new theoretical and practical challenges that did not exist a decade ago. For example: developing RandLAPACK and RandBLAS leads to new algorithmic and theoretical abstractions, different from those present in TCS or NLA, and different from those common in statistics and ML; recent developments in neural network training highlight important trade-offs between communication and computation, and between forward accuracy and parameter stability; and recent developments in hardware have led to new trade-offs between latency and throughput, at both the model training and model inference stages. Complementing these challenges, recent theoretical developments in Random Matrix Theory (RMT) have provided finer quantitative performance analyses than were possible with traditional RandNLA analysis tools. These developments come from RMT as well as from RandNLA itself, and they permit both finer analysis of existing methods and the development of novel methods that better bridge the theory-practice gap.

In this tutorial, we will provide a self-contained review/overview of RandNLA in light of these recent developments, highlighting upcoming and future trends. We will introduce the foundations of RandNLA and matrix sketching, with a particular focus on applications in ML and stochastic optimization, followed by an overview of recent developments in using sketching methods to gain stronger control over the convergence and generalization properties of stochastic algorithms. We will cover both the theoretical aspects of these techniques and their applications in the context of important ML and data science research topics. Thus, our discussion will be relevant not only to theoretical researchers who wish to learn the latest advances in RandNLA, but also to a general audience of ML and data science researchers and practitioners who want to incorporate RandNLA methods into their large-scale data problems.
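To give a flavor of the randomized low-rank approximation methods covered in the tutorial, here is a minimal NumPy sketch in the style of the randomized range finder surveyed in [9]: sample the range of A with a Gaussian test matrix, orthonormalize, and compute an exact SVD of the small projected matrix. The matrix sizes, target rank k, and oversampling parameter p are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# A synthetic matrix that is (nearly) rank k, for illustration.
m, n, k = 1000, 800, 20
U, _ = np.linalg.qr(rng.standard_normal((m, k)))
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
A = U @ np.diag(np.logspace(0, -3, k)) @ V.T + 1e-6 * rng.standard_normal((m, n))

# Randomized range finder: sketch the range of A with a Gaussian test
# matrix of k + p columns, where p is a small oversampling parameter.
p = 10
Y = A @ rng.standard_normal((n, k + p))
Q, _ = np.linalg.qr(Y)  # orthonormal basis for an approximate range of A

# Project onto the small subspace and take an exact SVD there.
B = Q.T @ A  # (k + p) x n
Ub, svals, Vt = np.linalg.svd(B, full_matrices=False)
A_approx = (Q @ Ub[:, :k]) @ np.diag(svals[:k]) @ Vt[:k]

# Relative error of the rank-k approximation (small for this spectrum).
print(np.linalg.norm(A - A_approx) / np.linalg.norm(A))
```

The single multiplication A @ Omega dominates the cost, which is what makes this family of methods attractive for large matrices; a few power iterations or structured test matrices are common refinements.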


References

[1]  Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning.
M. Dereziński and M. W. Mahoney. 2024.
Proc. Conference on Knowledge Discovery and Data Mining (KDD'24).  PDF

[2]  RandNLA: Randomized Numerical Linear Algebra.
P. Drineas and M. W. Mahoney. 2016.
Commun. ACM 59 (2016), 80–90.  PDF

[3]  Lectures on Randomized Numerical Linear Algebra.
P. Drineas and M. W. Mahoney. 2018.
In The Mathematics of Data. AMS/IAS/SIAM, 1–48.  PDF

[4]  Lecture Notes on Randomized Linear Algebra.
M. W. Mahoney. 2016.
Technical Report Preprint: arXiv:1608.04481.  PDF

[5]  An introduction to matrix concentration inequalities.
J. A. Tropp. 2015.
NOW Publishers, Boston.  PDF

[6]  Sketching as a tool for numerical linear algebra.
D. P. Woodruff. 2014.
Foundations and Trends in Theoretical Computer Science 10, 1–2 (2014), 1–157.  PDF

[7]  Determinantal Point Processes in Randomized Numerical Linear Algebra.
M. Dereziński and M. W. Mahoney. 2021.
Notices of the American Mathematical Society 68, 1 (2021), 34–45.  PDF

[8]  Randomized algorithms for matrices and data.
M. W. Mahoney. 2011.
NOW Publishers, Boston.  PDF

[9]  Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions.
N. Halko, P.-G. Martinsson, and J. A. Tropp. 2011.
SIAM review 53, 2 (2011), 217–288.  PDF

[10]  Randomized algorithms in numerical linear algebra.
R. Kannan and S. Vempala. 2017.
Acta Numerica 26 (2017), 95–135.  PDF

[11]  Randomized methods for matrix computations.
P.-G. Martinsson. 2018.
In The Mathematics of Data. AMS/IAS/SIAM, 187–230.  PDF

[12]  Randomized numerical linear algebra: Foundations and algorithms.
P.-G. Martinsson and J. A. Tropp. 2020.
Acta Numerica 29 (2020), 403–572.  PDF

[13]  Randomized Numerical Linear Algebra – A Perspective on the Field with an Eye to Software.
R. Murray, J. Demmel, M. W. Mahoney, N. B. Erichson, M. Melnichenko, O. A. Malik, L. Grigori, P. Luszczek, M. Dereziński, M. E. Lopes, T. Liang, H. Luo, and J. Dongarra. 2023.
Technical Report Preprint: arXiv:2302.11474v2.  PDF