SPADA lab at ICML in Vienna

I am excited to be a part of three papers at the International Conference of Machine Learning this July in Vienna.

Congratulations to Can Yaras for having his work on compression in deep low-rank learning, with co-authors Peng Wang and Qing Qu, accepted as an oral presentation for Tuesday afternoon! This work proves that when training deep linear networks, the gradient descent dynamics are limited to an invariant subspace. This subspace can be leveraged to make training and overparameterization more efficient, and allows us to reap the benefits of deep overparameterization without the computational burden. The code is available on Can’s github site. I talked about this work for the 1W-Minds seminar in April.

Peng Wang and Huikang Liu led our work on symmetric matrix completion with ReLU sampling that will be presented as a poster on Wednesday. We showed that it is possible to recover a low-rank matrix with sampling that is highly dependent on the matrix entries — we focus on ReLU sampling (and variants) where only positive entries are observed.

Finally, Wisconsin-Madison PhD student Yuchen Li will be presenting his work on block Riemannian MM methods, also with a poster on Wednesday. He proved iteration guarantees for convergence to a stationary point for general multi-block MM algorithms where any number of blocks may be constrained to a Riemannian manifold. His complexity results reduce to well-known results in the Euclidean case. This work is broadly applicable to alternating MM algorithms for machine learning problems.

SPADA Lab Research at AI Stats

Congratulations to Davoud Ataee Tarzanagh and Soo Min Kwon, whose research was presented in poster sessions at AI Stats this morning!

Davoud’s work on Online Bilevel Optimization was entirely conceived and driven by him during his postdoc at UM. The paper has novel definitions of bilevel dynamic regret, and he and Parvin proved many fabulous results for regret bounds for online alternating gradient descent in the strictly convex setting (with a matching lower bound) all the way to the nonconvex setting. He demonstrated its usefulness on online hyperparameter tuning, online loss tuning for imbalanced data, and then online meta learning with Bojian’s expertise! Online learning has provided a sea change for so much of ML on massive data, and we believe that OBO is a next crucial step for modern applications that commonly require careful balancing of objectives.

Soo Min and Tsinghua student Zekai Zhang’s work on Efficient Low-Dimensional Compression of Overparameterized Models demonstrates a method for compressing overparameterized deep linear layers in deep networks. Their approach gets consistently improved generalization error in a fraction of the computation time. The work shows that leveraging inherent low-dimensional structure within the model parameter updates, we can reap the benefits of overparameterization without the computational burden.

Research presentations at CPAL 2024

At the inaugural Conference on Parsimony and Learning (CPAL), my group is presenting three works that have come out of a recent exciting collaboration with UM Prof Qing Qu and other colleagues on low-rank learning in deep networks. Prof Qu’s prior work studying neural collapse in deep networks has opened many exciting directions for us to pursue! All three works study deep linear networks (DLNs), i.e. deep matrix factorization. In this setting (which is simplified from deep neural networks that have nonlinear activations), we can prove several interesting fundamental facts about the way DLNs learn from data when trained with gradient descent. Congratulations SPADA members Soo Min Kwon, Can Yaras, and Peng Wang (all co-advised by Prof Qu) for these publications!

Yaras, C., Wang, P., Hu, W., Zhu, Z., Balzano, L., & Qu, Q. (2023, December 1). Invariant Low-Dimensional Subspaces in Gradient Descent for Learning Deep Linear Networks. Conference on Parsimony and Learning (Recent Spotlight Track). https://openreview.net/forum?id=oSzCKf1I5N

Wang, P., Li, X., Yaras, C., Zhu, Z., Balzano, L., Hu, W., & Qu, Q. (2023, December 1). Understanding Hierarchical Representations in Deep Networks via Feature Compression and Discrimination. Conference on Parsimony and Learning (Recent Spotlight Track). https://openreview.net/forum?id=Ovuu8LpGZu

Kwon, S. M., Zhang, Z., Song, D., Balzano, L., & Qu, Q. (2023, December 1). Efficient Low-Dimensional Compression of Overparameterized Networks. Conference on Parsimony and Learning (Recent Spotlight Track). https://openreview.net/forum?id=1AVb9oEdK7

Congratulations Dr. Gilman and Dr. Du!

Last fall and winter, SPADA PhD students Kyle Gilman and Zhe Du graduated. Kyle’s thesis was titled “Scalable Algorithms Using Optimization on Orthogonal Matrix Manifolds,” and he continues to make fundamental contributions to interesting modern optimization problems. He is currently an Applied AI/ML Senior Associate at JPMorgan Chase. Zhe’s thesis was titled “Learning, Control, and Reduction for Markov Jump Systems,” with lots of interesting work at the intersection of machine learning and control. He is currently a Postdoctoral researcher working with Samet Oymak and Fabio Pasqualetti. I am excited to follow their work into the future as they make an impact in optimization, machine learning, and control!

K-Subspaces Algorithm Results at ICML

I’m excited that our results for the K-Subspaces algorithm were accepted to ICML. My postdoc Peng Wang will be presenting his excellent work; you may read the paper here or attend his session if you are interested. K-Subspaces (KSS) is a natural generalization of K-Means to higher dimensional centers, originally proposed by Bradley and Mangasarian in 2000. Peng not only showed that KSS converges locally, but that a simple spectral initialization guarantees a close-enough initialization in the case of data drawn randomly from arbitrary subspaces. This makes a giant step in a line of questioning that has been open for more than 20 years. Great work Peng!

HePPCAT in TSP

Our work on heteroscedastic PCA continues with our article “HePPCAT: Probabilistic PCA for Data with Heteroscedastic Noise,” published in IEEE Transactions on Signal Processing. In this paper we developed novel ascent algorithms to maximize the heteroscedastic PCA likelihood, simultaneously estimating the principal components and the heteroscedastic noise variances. We show a compelling application to air quality data, where it is common to have data both from sensors that are high-quality EPA instruments and others that are consumer grade. Code for the paper experiments is available at https://gitlab.com/heppcat-group, and the HePPCAT method is available as a registered Julia package. Congratulations to my student Kyle Gilman, former student David Hong, and colleague Jeff Fessler.

Congratulations Dr. Bower!

Last fall, my PhD student Amanda Bower defended her thesis titled “Dealing with Intransitivity, Non-Convexity, and Algorithmic Bias in Preference Learning.” Amanda was in the Applied Interdisciplinary Math program, co-advised by Martin Strauss. She will now be moving on to work with Twitter’s ML Ethics, Transparency, and Accountability (META) group. We are so proud that she is going to go make her mark on the world. Congratulations Dr. Bower!

Faktum är att Viagra på nätet börjar fungera efter 20-25 minuter för många patienter, vilket ger upp till 6 timmars prestanda från och med den tiden. Det mesta av Viagra som du kan köpa online är generiskt sildenafilcitrat och ofta är tabletterna märkta på detta sätt.

Preference Learning with Salient Features

I am excited that Amanda Bower will have the opportunity to discuss our new work in preference learning, “Preference Modeling with Context-Dependent Salient Features“, at ICML next week. In this work, we propose a new model for preference learning that takes into account the fact that when making pairwise comparisons, certain features may play an outside role in the comparison, making the pairwise comparison result inconsistent with a general preference order. We look forward to hearing people’s questions and feedback! Update post-conference: Her presentation can be viewed here.

Online Tensor Completion and Tracking

Kyle Gilman and I have a preprint out describing a new algorithm for online tensor completion and tracking. We derive and demonstrate an algorithm that operates on streaming tensor data, such as hyperspectral video collected over time, or chemo-sensing experiments in space and time. Kyle presented his work at the first fully virtual ICASSP, which you can view here. Anyone can register for free to this year’s virtual ICASSP and watch the videos, post questions, and join the discussion. Kyle’s code is also available here. We think this algorithm will have a major impact in speeding up low-rank tensor processing, especially with time-varying data, and we welcome questions and feedback.

The speed of the girl’s movement around the city is about four kilometers per hour, or two and a half, if she wears heels higher than six centimeters. The zone of possible contact is five meters free hookup, no more. That is, everything about everything you get … How much do you get? (Damn, they told me for a reason: study math properly, you will need it!) Well, like, five seconds.

Congratulations Dejiao and David!

SPADA lab is so proud of Dr. Dejiao Zhang and Dr. David Hong. They both successfully defended their PhD dissertations this spring. Dejiao is going to Amazon Web Services next, and David is going to a postdoc at the University of Pennsylvania. We expect you both to go off and do great things! Congratulations!