Ensemble K-Subspaces

Yesterday I gave a talk on Subspace Clustering using Ensemble methods at the Simons Institute. See the video here!

This is work with John Lipor, David Hong, and Yan Shuo Tan. Our related paper has been just updated on the arxiv. Our key observation was that, while K-Subspaces (KSS) works poorly overall and depends heavily on initialization, it still seems to give partially good clustering information. We therefore use it as a “weak clusterer” and combine ensembles of KSS (EKSS) by averaging the co-association/affinity matrices. This works extremely well, both in simulation and on real data, and also in theory. We were able to show that EKSS gives correct clustering in a variety of common cases: e.g. for subspaces with bounded affinity, and with noisy data and missing data. Our theory generalizes theory of the Thresholded Subspace Clustering algorithm to show that any algorithm that produces an affinity matrix that is an approximation to a monotonic function of absolute inner products will give correct clustering. This general theory should be broadly applicable to many geometric approaches to subspace clsutering.