Postdoc Opportunity at the University of Michigan

The position is to begin in spring 2019.


Please email Laura Balzano <> with the subject “Joining the Balzano lab — postdoc 2019” if you are interested.

We are seeking a postdoc who is interested in applying machine learning techniques to real-time dynamic data analysis. While machine learning has advanced significantly over the last decade, its application to dynamic time-varying data is still in its infancy. This project will focus on three ML areas: online learning, stochastic gradient methods, and streaming PCA. We will work on theory to understand how the standard approaches behave when the data are time-varying, develop appropriate models for time-varying data, and develop novel approaches along with convergence theory. Our main application focus will be power systems engineering and computer vision. In power systems, we will develop methodologies to infer the real-time behavior of aggregations of distributed energy resources from hierarchical, heterogeneous, and incomplete measurements of power system quantities. In computer vision, we will develop real-time algorithms for object tracking and activity recognition in video.

Optimally Weighted PCA for High-dimensional Heteroscedastic Data

Today I had the opportunity to speak about very recent results by my student David Hong (joint work with Jeff Fessler) on asymptotic recovery guarantees for weighted PCA on high-dimensional heteroscedastic data. In the paper we recently posted online, we give an asymptotic analysis (as the number of samples and the dimension of the problem both grow to infinity, with their ratio converging to a fixed constant) of how well weighted PCA recovers the components, amplitudes, and scores. Those recovery expressions allow us to find weights that give optimal recovery, and the weights turn out to be a very simple expression involving only the noise variance and the PCA amplitudes. To learn more, watch my talk here, and let us know if you have any questions!
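For readers who want to experiment, here is a minimal sketch of weighted PCA on synthetic heteroscedastic data. The function name `weighted_pca`, the planted one-component model, and the inverse-noise-variance weights are all assumptions chosen for illustration; in particular, these are not the optimal weights derived in the paper.

```python
import numpy as np

def weighted_pca(Y, weights, k):
    """Top-k eigenpairs of the weighted sample covariance of Y.

    Y is d x n (one sample per column); weights holds one nonnegative
    weight per sample. This is a hypothetical helper for illustration.
    """
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Weighted covariance: sum_i w_i * y_i y_i^T, normalized by the total weight.
    C = (Y * w) @ Y.T / w.sum()
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return eigvecs[:, top], eigvals[top]

# Synthetic data (assumed model): one planted component u with amplitude
# theta, plus noise whose standard deviation differs between two groups.
rng = np.random.default_rng(0)
d, n = 50, 500
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
theta = 5.0
sigma = np.where(np.arange(n) < n // 2, 1.0, 3.0)   # per-sample noise std
scores = rng.standard_normal(n)
Y = theta * np.outer(u, scores) + rng.standard_normal((d, n)) * sigma

# Unweighted PCA versus inverse-noise-variance weighting (an assumed choice).
U_plain, _ = weighted_pca(Y, np.ones(n), k=1)
U_weighted, _ = weighted_pca(Y, 1.0 / sigma**2, k=1)

# Squared inner product with the planted component (1.0 = perfect recovery).
recovery_plain = float((u @ U_plain[:, 0]) ** 2)
recovery_weighted = float((u @ U_weighted[:, 0]) ** 2)
print(f"plain: {recovery_plain:.3f}, weighted: {recovery_weighted:.3f}")
```

Swapping in a different weight vector is a one-line change, which makes it easy to compare weighting schemes on simulated data.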

AFOSR Young Investigator

I have great news: my AFOSR Young Investigator proposal was accepted for funding. My proposal focused on time-varying low-rank factorization models and on ways of solving a variety of related non-convex problem formulations. Read more about it here. I look forward to the contributions we will be able to make with the support of AFOSR.


Improving K-Subspaces via Coherence Pursuit

John Lipor, Andrew Gitlin, Biaoshuai Tao, and I have a new paper, “Improving K-Subspaces via Coherence Pursuit,” to be published in the IEEE Journal of Selected Topics in Signal Processing issue “Robust Subspace Learning and Tracking: Theory, Algorithms, and Applications.” In it we present a new subspace clustering algorithm, Coherence Pursuit – K-Subspaces (CoP-KSS). Here is the code for CoP-KSS and for our figures. Our paper considers specifically the PCA step in K-Subspaces, where a best-fit subspace estimate is determined from a (possibly incorrect) clustering. When a given cluster has points from multiple low-rank subspaces, PCA is not a robust approach. We replace that step with Coherence Pursuit, a new algorithm for Robust PCA. We prove that Coherence Pursuit can indeed recover the “majority” subspace when data from other low-rank subspaces contaminate the cluster. In this paper we also prove (to the best of our knowledge, for the first time) that the K-Subspaces problem is NP-hard, and indeed even NP-hard to approximate within any finite factor for large enough subspace rank.
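To make the PCA step concrete, here is a minimal sketch of the plain K-Subspaces alternation (not CoP-KSS itself, and not our released code). The names `fit_subspace` and `k_subspaces`, the random re-seeding of undersized clusters, and the synthetic data are assumptions for illustration. The `fit_subspace` call is the step that CoP-KSS replaces with Coherence Pursuit.

```python
import numpy as np

def fit_subspace(X, r):
    """PCA step: orthonormal basis for the best-fit rank-r subspace of X (d x m)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :r]

def k_subspaces(X, K, r, init_labels, n_iter=50, seed=0):
    """Alternate between fitting K rank-r subspaces and reassigning points."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    labels = np.asarray(init_labels).copy()
    bases = []
    for _ in range(n_iter):
        bases = []
        for k in range(K):
            Xk = X[:, labels == k]
            if Xk.shape[1] < r:
                # Assumed fallback: re-seed an undersized cluster at random.
                Xk = rng.standard_normal((d, r))
            bases.append(fit_subspace(Xk, r))
        # Assign each point to the subspace capturing the most of its energy.
        proj = np.stack([np.linalg.norm(U.T @ X, axis=0) for U in bases])
        new_labels = np.argmax(proj, axis=0)
        if np.array_equal(new_labels, labels):
            break   # assignments are stable, so the alternation has converged
        labels = new_labels
    return labels, bases

# Noiseless synthetic data drawn from two random 2-dimensional subspaces of R^10.
rng = np.random.default_rng(1)
d, r, m = 10, 2, 40
U1, _ = np.linalg.qr(rng.standard_normal((d, r)))
U2, _ = np.linalg.qr(rng.standard_normal((d, r)))
X = np.hstack([U1 @ rng.standard_normal((r, m)), U2 @ rng.standard_normal((r, m))])
true_labels = np.repeat([0, 1], m)

# Starting from the true clustering, the alternation should leave it unchanged.
labels_from_truth, bases = k_subspaces(X, 2, r, init_labels=true_labels)
# From a random start the result depends on the initialization.
random_init = rng.integers(0, 2, size=2 * m)
labels_from_random, _ = k_subspaces(X, 2, r, init_labels=random_init)
```

A random start can converge to a poor local solution, which is one reason the quality of the subspace estimate from a partly incorrect clustering matters.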


New paper in Journal of Multivariate Analysis

Congratulations to my student David Hong (and his co-advisor Jeff Fessler) on our article published in the Journal of Multivariate Analysis, titled “Asymptotic performance of PCA for high-dimensional heteroscedastic data.” Heteroscedastic data, where different data points are of differing quality (precisely, have different noise variance), are common in many interesting big data problems. Sensor network data, medical imaging using historical data, and astronomical imaging are just a few examples. PCA is known to be the maximum likelihood estimate for data with additive Gaussian noise of a single variance across all the data points. This work investigates the performance of PCA when that homoscedastic noise assumption is violated. We give precise predictions for the recovery of subspaces and singular values in a spiked/planted model, and show that vanilla PCA (perhaps unsurprisingly) has suboptimal subspace recovery when the data are heteroscedastic.

Publications Update

We have had many exciting publications in the last several months.

My student Dejiao Zhang and I worked with Mario Figueiredo and two other Michigan students on applying OWL regularization in deep networks. The intuition is that since OWL can tie correlated regressors, it should be able to do the same in deep nets that experience a high degree of co-adaptation (and correlation) of nodes in the network. Dejiao presented our paper Learning to Share: Simultaneous Parameter Tying and Sparsification for Deep Learning at ICLR last month and we will present Simultaneous Sparsity and Parameter Tying for Deep Learning using Ordered Weighted L1 Regularization at SSP next month.
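As a quick reference for the regularizer, here is a minimal sketch of the OWL (ordered weighted L1) norm as it is commonly defined: the entries of the vector are sorted by decreasing magnitude and paired with non-increasing, nonnegative weights. The function name `owl_norm` is an assumption for illustration, and the proximal step and training loop used in the papers are not shown.

```python
import numpy as np

def owl_norm(x, w):
    """OWL norm: sum_i w_i * |x|_[i], with |x|_[1] >= |x|_[2] >= ... .

    The weights w must be nonnegative and non-increasing.
    """
    x = np.asarray(x, dtype=float).ravel()
    w = np.asarray(w, dtype=float).ravel()
    if x.shape != w.shape:
        raise ValueError("x and w must have the same number of entries")
    if np.any(w < 0) or np.any(np.diff(w) > 0):
        raise ValueError("w must be nonnegative and non-increasing")
    magnitudes = np.sort(np.abs(x))[::-1]   # magnitudes in decreasing order
    return float(magnitudes @ w)

x = [1.0, -3.0, 2.0]
print(owl_norm(x, [3.0, 2.0, 1.0]))   # 3*3 + 2*2 + 1*1 = 14.0
print(owl_norm(x, [0.5, 0.5, 0.5]))   # equal weights: 0.5 * L1 norm = 3.0
print(owl_norm(x, [1.0, 0.0, 0.0]))   # only the first weight: L-infinity norm = 3.0
```

The two special cases show how the weights interpolate between the L1 and L-infinity norms. Strictly decreasing weights penalize the largest magnitudes most heavily, which is what encourages correlated parameters to take equal values.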

My colleague Johanna Mathieu, her student Greg Ledva, and I published a paper in Transactions on Power Systems studying Real-Time Energy Disaggregation of a Distribution Feeder’s Demand Using Online Learning. The work leverages recent results in dynamic online learning, where classes of dynamical models are used to apply online learning to the time-varying signal setting. This work can leverage existing sensing structure to improve prediction of distributed energy resources, demand-responsive electric loads and residential solar generation. We also have a book chapter in Energy Markets and Responsive Grids, also written with my student Zhe Du.

Greg Ongie, David Hong, Dejiao Zhang, and I have been working on adaptive sampling for subspace estimation. If you have a large matrix in memory that is difficult to access, but you want to compute a low-rank approximation of it, one option is to sketch it: read only parts of the matrix and compute an approximation from those. Our paper Enhanced Online Subspace Estimation Via Adaptive Sensing describes an adaptive sampling scheme to do exactly that, and using that scheme along with the GROUSE subspace estimation algorithm, we give global convergence guarantees to the true underlying low-rank matrix. We will also present Online Estimation of Coherent Subspaces with Adaptive Sampling at SSP next month, which constrains the adaptive samples to be entry-wise and sees similar improvements.

Rounding it out, Zhe Du will be presenting our work with Necmiye Ozay on A Robust Algorithm for Online Switched System Identification at the SYS ID conference in July, and Bob Malinas and David Hong will present our work with Jeff Fessler on Learning Dictionary-Based Unions of Subspaces for Image Denoising at EUSIPCO in September. This spring Amanda Bower presented our work with Lalit Jain on The Landscape of Nonconvex Quadratic Feasibility, studying the minimizers for a non-convex formulation of the preference learning problem; and next week Naveen Murthy presents our work with Greg Ongie and Jeff Fessler on Memory-efficient Splitting Algorithms for Large-Scale Sparsity Regularized Optimization at the CT Meeting. Last fall Greg Ongie, Saket Dewangan, Jeff Fessler and I had a paper Online Dynamic MRI Reconstruction via Robust Subspace Tracking at GlobalSIP, pursuing the interesting idea of online subspace tracking for time-varying signals.

So many exciting research directions that we will continue to pursue!

Congratulations John!

Congratulations to Dr. John Lipor for successfully defending his PhD thesis in September! The title of his work is “Sensing Structured Signals with Active and Ensemble Methods.” In January he will start as Assistant Professor in the Portland State University ECE Department.

Congratulations David!

David Hong was awarded the Richard and Eleanor Towner Prize for Outstanding PhD Research at the Michigan Engineering Graduate Symposium. This is a prize awarded annually across the entire college of engineering to PhD students within about a year of graduation, and the criteria for selection are creativity, innovation, impact on society, and achievement. Congratulations David!

ICML Acceptances, SIAM OPT success

Congratulations to postdoc Greg Ongie, whose excellent work on Variety Models for Matrix Completion has been accepted to ICML. We’re very excited about the potential applications and open problems that we posed in this work. Congratulations also to John Lipor, whose work on Active Subspace Clustering has also been accepted to ICML; I’ve spoken about this work before at the Simons Institute workshop on Interactive Learning. It achieves state-of-the-art clustering error on several benchmark datasets using very few pairwise cluster queries. (Edit: See here for Greg’s talk and here for John’s talk at ICML!)

We also just finished a week at the SIAM Optimization conference, where our mini-symposium on Non-convex Optimization in Data Analysis was a huge hit. We had a full room for each session and 12 outstanding talks. Thanks to my co-organizers Stephen Wright, Rebecca Willett, and Rob Nowak, and thanks to all the speakers and participants.

Student accomplishments

Congratulations to John Lipor for passing his proposal defense!

Congratulations to David Hong for winning both the session award for (SIC) Signal and Image Processing, Computer Vision at the University of Michigan Engineering Graduate Symposium and the award for Most Interesting Methodological Advancement in the Michigan MIDAS Symposium for his work on heteroscedastic PCA.

Congratulations to Chenlan Wang for winning the Rackham International Student Fellowship.

Nice work team!