Data matrices are ubiquitous, and in big, messy data applications they often have many missing entries. The problem of matrix completion asks, very generally: what assumptions can we make on the underlying matrix in order to reconstruct it in full? We have studied this problem under the low-rank model and the union-of-subspaces model. Most of our work assumes entry-wise measurements of the matrix, but we have also studied compressive measurements and monotonic nonlinear measurements.
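To make the low-rank model concrete, here is a minimal alternating-least-squares sketch of matrix completion on synthetic data, not any particular algorithm from our papers; the dimensions, rank, and sampling rate are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rank-2 matrix with roughly half its entries observed.
m, n, r = 30, 30, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
mask = rng.random((m, n)) < 0.5  # True where an entry is observed

# Alternating least squares: fix V and solve for each row of U,
# then fix U and solve for each row of V, using only observed entries.
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
for _ in range(50):
    for i in range(m):
        cols = mask[i]
        U[i] = np.linalg.lstsq(V[cols], M[i, cols], rcond=None)[0]
    for j in range(n):
        rows = mask[:, j]
        V[j] = np.linalg.lstsq(U[rows], M[rows, j], rcond=None)[0]

M_hat = U @ V.T
err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

With noiseless data and enough observed entries relative to the rank, the relative error of the full reconstruction drops to near zero even though only about half the entries were seen.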

Many signal processing and computer vision applications need an estimate of a signal subspace from streaming high-dimensional observations. We have developed several Grassmannian incremental gradient descent algorithms and studied their convergence properties. We have also studied batch subspace estimation with heteroscedastic data.
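The flavor of a Grassmannian incremental update can be seen in this small sketch in the style of a GROUSE-type step with fully observed streaming vectors; the step choice (the "greedy" rotation that brings each new vector into the subspace) and all dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 50, 3

# Ground-truth subspace; streaming vectors are drawn from it.
U_true, _ = np.linalg.qr(rng.standard_normal((d, k)))
U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # current estimate
err0 = np.linalg.norm(U - U_true @ (U_true.T @ U))

for t in range(2000):
    v = U_true @ rng.standard_normal(k)   # new streaming observation
    w = U.T @ v                           # coefficients in current subspace
    r = v - U @ w                         # residual off the subspace
    rn, wn = np.linalg.norm(r), np.linalg.norm(w)
    if rn < 1e-12 or wn < 1e-12:
        continue
    # Rank-one rotation of span(U): the minimal rotation on the
    # Grassmannian that brings v into the estimated subspace.
    theta = np.arctan(rn / wn)
    U = U + np.outer((np.cos(theta) - 1.0) * (U @ w) / wn
                     + np.sin(theta) * r / rn, w / wn)

# Distance of the estimate from the true subspace.
err = np.linalg.norm(U - U_true @ (U_true.T @ U))
```

Each update is a rotation, so the columns of U stay orthonormal without re-orthogonalization, and the estimate converges toward the true subspace as vectors stream in.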

Clustering is one of the most commonly used data exploration tools, but data often hold interesting geometric structure for which generic clustering objectives are too coarse. Subspace clustering, also known as the union-of-subspaces model, is a simple generalization that fits each cluster with a low-dimensional subspace. Our group has developed state-of-the-art approaches to subspace clustering in the active clustering setting and when the data matrix is incomplete or compressed.
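A simple baseline that illustrates the union-of-subspaces idea (though not our group's methods) is K-subspaces: alternate between assigning each point to its nearest subspace and refitting each subspace by PCA. The data, dimensions, and restart count below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
d, k_dim, n_per = 20, 2, 100

# Two random 2-dimensional subspaces in R^20, with points drawn from each.
bases = [np.linalg.qr(rng.standard_normal((d, k_dim)))[0] for _ in range(2)]
X = np.hstack([B @ rng.standard_normal((k_dim, n_per)) for B in bases])
truth = np.repeat([0, 1], n_per)

def k_subspaces(X, n_clusters, k_dim, n_iter=30):
    """One run of K-subspaces from a random initialization."""
    est = [np.linalg.qr(rng.standard_normal((X.shape[0], k_dim)))[0]
           for _ in range(n_clusters)]
    for _ in range(n_iter):
        # Assign each point to the subspace with the smallest residual.
        resid = np.stack([np.linalg.norm(X - B @ (B.T @ X), axis=0)
                          for B in est])
        labels = resid.argmin(axis=0)
        # Refit each subspace by PCA on its assigned points.
        for c in range(n_clusters):
            pts = X[:, labels == c]
            if pts.shape[1] >= k_dim:
                est[c] = np.linalg.svd(pts, full_matrices=False)[0][:, :k_dim]
    return labels, resid.min(axis=0).sum()

# Like k-means, K-subspaces is sensitive to initialization,
# so keep the best of several random restarts by total residual.
labels, _ = min((k_subspaces(X, 2, k_dim) for _ in range(10)),
                key=lambda lr: lr[1])
acc = max(np.mean(labels == truth), np.mean(labels == 1 - truth))
```

On noiseless data from well-separated subspaces, a successful run drives the total residual to zero and recovers the true partition up to a label permutation.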

Sensor data has been a focus of signal processing for decades. Signal processing for modern sensing can no longer rely on careful maintenance and calibration, nor on expert-based outlier detection. We are developing machine learning algorithms to ensure that these data are still useful and enlightening, as well as the supporting theory to give guarantees that we may use data from large deployments with confidence.

In many applications, it is possible to take measurements sequentially and use them for inference. Examples include genetic experiments, environmental sensing, and crowdsourced image tasks. The problem of how to design sequential measurements for machine learning inference is called active learning. We have studied active learning algorithms for image clustering/classification and spatial environmental sampling.
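A classic toy example shows why choosing measurements adaptively can pay off exponentially. To learn a one-dimensional threshold classifier, passive labeling needs on the order of n labels, while binary search (the simplest active strategy) needs only about log2(n). Everything below, including the hidden threshold, is a hypothetical setup:

```python
import numpy as np

rng = np.random.default_rng(3)

# A pool of unlabeled points on a line; labels flip at an unknown threshold.
x = np.sort(rng.random(1000))
t_star = 0.62                      # hidden threshold (hypothetical)
labels = (x > t_star).astype(int)  # querying a label "costs" one measurement

# Actively query labels by binary search on the sorted pool.
lo, hi, n_queries = 0, len(x) - 1, 0
while hi - lo > 1:
    mid = (lo + hi) // 2
    n_queries += 1                 # one label query
    if labels[mid]:
        hi = mid                   # threshold is to the left of x[mid]
    else:
        lo = mid                   # threshold is to the right of x[mid]

t_hat = 0.5 * (x[lo] + x[hi])      # estimated threshold
```

About ten queries locate the threshold to within the spacing of the thousand-point pool, whereas a passive learner labeling points at random would need far more labels for the same accuracy.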

A great deal of work in sparse signal processing focuses on linear measurements. Applications, on the other hand, often have fundamental nonlinearities that must be modeled. We have studied algebraic variety models, monotonic nonlinear measurement functions, pairwise comparison (ranking) data, and sparsity in deep networks.
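One standard illustration of a monotonic nonlinear measurement model (a single-index model, not our group's estimators) is that for Gaussian measurement vectors, a plain average of y-weighted measurements recovers the signal direction up to scale, by Stein's lemma. The choice of tanh as the unknown monotone link is an assumption for the demo:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 20, 20000

x = rng.standard_normal(d)
x /= np.linalg.norm(x)           # true signal direction, unit norm

A = rng.standard_normal((n, d))  # Gaussian measurement vectors
g = np.tanh                      # unknown monotone nonlinearity (assumed)
y = g(A @ x)                     # nonlinear measurements y_i = g(a_i . x)

# For Gaussian a_i, E[y a] is proportional to x (Stein's lemma),
# so the empirical average recovers the direction up to scale.
x_hat = A.T @ y / n
x_hat /= np.linalg.norm(x_hat)

cos_sim = abs(x_hat @ x)         # alignment with the true direction
```

Even though g is never inverted or even identified, the direction of x is recovered; only the scale is lost to the unknown nonlinearity.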

SPADA lab works mostly on algorithm design and theoretical analysis; however, Professor Balzano's favorite mathematical problems are motivated by fascinating and important problems in engineering and science. We have worked on applications in computer vision, environmental sensing, power systems, and computer network inference.