Michał Dereziński

Email: derezin at umich edu

I am an Assistant Professor of Computer Science and Engineering at the University of Michigan.

Previously, I was a postdoctoral fellow in the Department of Statistics at the University of California, Berkeley, and a research fellow at the Simons Institute for the Theory of Computing. I obtained my Ph.D. in Computer Science at the University of California, Santa Cruz, advised by Professor Manfred Warmuth. Prior to UCSC, I completed Master's degrees in mathematics and computer science at the University of Warsaw. My research focuses on the theoretical foundations of randomized algorithms for machine learning, optimization, and data science.

Research interests: Foundations of machine learning and optimization, randomized linear algebra, high-dimensional statistics, random matrix theory


Students

Current Ph.D. students: Jiaming Yang and Sachin Garg

Teaching

  • Fall 2024 EECS 498-005/598-005 Algorithms for Machine Learning and Data Science (flyer)
  • Winter 2024 EECS 498-005 Algorithms for Data Science
  • Fall 2023 EECS 376 Foundations of Computer Science
  • Winter 2023 EECS 498-005 Algorithms for Data Science
  • Fall 2022 EECS 598-003 Randomized Numerical Linear Algebra in Machine Learning
  • Winter 2022 EECS 545 Machine Learning
  • Fall 2021 EECS 281 Data Structures and Algorithms

Publications

Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients
S. Garg, A. Berahas, M. Dereziński
(2024)  arXiv

HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
Y. Yang, J. Yang, W. Hu, M. Dereziński
(2024)  arXiv

Solving Dense Linear Systems Faster than via Preconditioning
M. Dereziński, J. Yang
Symposium on Theory of Computing (STOC 2024)  arXiv

Optimal Embedding Dimension for Sparse Subspace Embeddings
S. Chenakkod, M. Dereziński, X. Dong, M. Rudelson
Symposium on Theory of Computing (STOC 2024)  arXiv

Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems
Y. Cho, J. W. Demmel, M. Dereziński, H. Li, H. Luo, M. W. Mahoney, R. J. Murray
(2023)  arXiv

Algorithmic Gaussianization through Sketching: Converting Data into Sub-gaussian Random Designs
M. Dereziński
Conference on Learning Theory (COLT 2023)  arXiv

Randomized Numerical Linear Algebra: A Perspective on the Field with an Eye to Software
R. Murray, J. Demmel, M. W. Mahoney, N. B. Erichson, M. Melnichenko, O. A. Malik, L. Grigori, M. Dereziński, M. E. Lopes, T. Liang, H. Luo
LAPACK Working Notes (2023)  arXiv

Sharp Analysis of Sketch-and-Project Methods via a Connection to Randomized Singular Value Decomposition
M. Dereziński, E. Rebrova
SIAM Journal on Mathematics of Data Science 6(1):127-153 (2024)  arXiv

Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
M. Dereziński
NeurIPS Workshop on Optimization for Machine Learning (OPT 2023)  arXiv

Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence
S. Na, M. Dereziński, M. W. Mahoney
Mathematical Programming 201:473-520 (2023)  PDF  arXiv

Unbiased estimators for random design regression
M. Dereziński, M. K. Warmuth, D. Hsu
Journal of Machine Learning Research 23(1):1-46 (2022)  PDF  arXiv

Domain Sparsification of Discrete Distributions using Entropic Independence
N. Anari, M. Dereziński, T.-D. Vuong, E. Yang
Innovations in Theoretical Computer Science (ITCS 2022)  arXiv

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update
M. Dereziński, J. Lacotte, M. Pilanci, M. W. Mahoney
Conference on Neural Information Processing Systems (NeurIPS 2021) Spotlight Presentation  arXiv

Query Complexity of Least Absolute Deviation Regression via Robust Uniform Convergence
X. Chen, M. Dereziński
Conference on Learning Theory (COLT 2021)  arXiv

Sparse sketches with small inversion bias
M. Dereziński, Z. Liao, E. Dobriban, M. W. Mahoney
Conference on Learning Theory (COLT 2021)  arXiv

LocalNewton: Reducing Communication Bottleneck for Distributed Learning
V. Gupta, A. Ghosh, M. Dereziński, R. Khanna, K. Ramchandran, M. W. Mahoney
Conference on Uncertainty in Artificial Intelligence (UAI 2021)  arXiv

Determinantal Point Processes in Randomized Numerical Linear Algebra
M. Dereziński, M. W. Mahoney
Notices of the AMS 68(1) (2021)  arXiv

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization
M. Dereziński, B. Bartan, M. Pilanci, M. W. Mahoney
Conference on Neural Information Processing Systems (NeurIPS 2020)  arXiv

Sampling from a k-DPP without looking at all items
D. Calandriello, M. Dereziński, M. Valko
Conference on Neural Information Processing Systems (NeurIPS 2020) Spotlight Presentation  arXiv

Precise expressions for random projections: Low-rank approximation and randomized Newton
M. Dereziński, F. Liang, Z. Liao, M. W. Mahoney
Conference on Neural Information Processing Systems (NeurIPS 2020)  arXiv

Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method
M. Dereziński, R. Khanna, M. W. Mahoney
Conference on Neural Information Processing Systems (NeurIPS 2020) Best Paper Award  arXiv

Exact expressions for double descent and implicit regularization via surrogate random design
M. Dereziński, F. Liang, M. W. Mahoney
Conference on Neural Information Processing Systems (NeurIPS 2020)  arXiv

Isotropy and Log-Concave Polynomials: Accelerated Sampling and High-Precision Counting of Matroid Bases
N. Anari, M. Dereziński
Symposium on Foundations of Computer Science (FOCS 2020)  arXiv

Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling
M. Mutný, M. Dereziński, A. Krause
Conference on Artificial Intelligence and Statistics (AISTATS 2020)  arXiv

Bayesian experimental design using regularized determinantal point processes
M. Dereziński, F. Liang, M. W. Mahoney
Conference on Artificial Intelligence and Statistics (AISTATS 2020)  arXiv

Exact sampling of determinantal point processes with sublinear time preprocessing
M. Dereziński, D. Calandriello, M. Valko
Conference on Neural Information Processing Systems (NeurIPS 2019)  arXiv

Distributed estimation of the inverse Hessian by determinantal averaging
M. Dereziński, M. W. Mahoney
Conference on Neural Information Processing Systems (NeurIPS 2019)  arXiv

Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression
M. Dereziński, K. L. Clarkson, M. W. Mahoney, M. K. Warmuth
Conference on Learning Theory (COLT 2019)  arXiv

Fast determinantal point processes via distortion-free intermediate sampling
M. Dereziński
Conference on Learning Theory (COLT 2019)  arXiv

Correcting the bias in least squares regression with volume-rescaled sampling
M. Dereziński, M. K. Warmuth, D. Hsu
Conference on Artificial Intelligence and Statistics (AISTATS 2019)  arXiv

Leveraged volume sampling for linear regression
M. Dereziński, M. K. Warmuth, D. Hsu
Conference on Neural Information Processing Systems (NeurIPS 2018)  arXiv

Reverse iterative volume sampling for linear regression
M. Dereziński, M. K. Warmuth
Journal of Machine Learning Research 19(1):853-891 (2018)  arXiv

Subsampling for Ridge Regression via Regularized Volume Sampling
M. Dereziński, M. K. Warmuth
Conference on Artificial Intelligence and Statistics (AISTATS 2018)  arXiv  Poster  Talk

Batch-Expansion Training: An Efficient Optimization Framework
M. Dereziński, D. Mahajan, S. S. Keerthi, S.V.N. Vishwanathan, M. Weimer
Conference on Artificial Intelligence and Statistics (AISTATS 2018)  arXiv  Poster  Talk

Discovering Surprising Documents with Context-Aware Word Representations
M. Dereziński, K. Rohanimanesh, A. Hydrie
Conference on Intelligent User Interfaces (IUI 2018)  PDF  Talk

Unbiased estimates for linear regression via volume sampling
M. Dereziński, M. K. Warmuth
Conference on Neural Information Processing Systems (NeurIPS 2017)  arXiv  Poster  Spotlight

The limits of squared Euclidean distance regularization
M. Dereziński, M. K. Warmuth
Conference on Neural Information Processing Systems (NeurIPS 2014)  PDF  Spotlight