L0 Isotonic Regression with Secondary Objectives

Quentin F. Stout
Computer Science and Engineering, University of Michigan


Extended Abstract: We provide algorithms for isotonic regression on arbitrary DAGs minimizing the L0 error (Hamming distance). This is also known as monotonic relabeling, and is applicable when the labels have a linear ordering but not necessarily a metric. There may be exponentially many optimal relabelings, so we use secondary criteria to determine which are best. For simple ordinal labels the criterion is maximizing the number of labels that are changed only to an adjacent label (applied recursively). For real-valued labels we minimize the Lp error. For linearly ordered sets we also give algorithms that minimize the sum of the Lp and weighted L0 errors, a form of penalized (regularized) regression. We also examine L0 isotonic regression on multidimensional coordinate-wise orderings.
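To make the L0 objective concrete: on a linearly ordered set, the minimum number of labels that must change to make the sequence nondecreasing equals n minus the length of a longest nondecreasing subsequence. The sketch below computes that error via this classic identity; it is an illustrative aside, not the algorithm from the paper.

```python
import bisect

def l0_isotonic_error(labels):
    """Minimum number of labels to change so the sequence becomes
    nondecreasing, i.e. the L0 (Hamming) isotonic regression error
    on a linear order.  Uses the identity
        error = n - (longest nondecreasing subsequence length),
    computed in O(n log n) patience-sorting style."""
    # tails[k] = smallest possible last value of a nondecreasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for v in labels:
        # bisect_right keeps equal values admissible, so the
        # subsequence is nondecreasing rather than strictly increasing.
        i = bisect.bisect_right(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(labels) - len(tails)
```

For example, [1, 3, 2, 4] needs one change (keep the nondecreasing subsequence 1, 2, 4 and relabel the 3), while a strictly decreasing sequence of length n needs n - 1 changes.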

Basic L0 algorithms, here and elsewhere, are based on violator graphs and flow algorithms. Flow algorithms have been undergoing rapid improvement (in O-notation), which leads to faster regression algorithms. For linear orderings we use a more direct approach that does not construct a violator graph. For multidimensional orderings we use the basic approach, but show how to rapidly construct a violator graph of size o(n²), which is in general smaller than the transitive reduction of the violator graph.
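As a point of reference for what a violator graph contains, the brute-force sketch below lists all violating pairs for a coordinate-wise (domination) order: an edge (i, j) whenever point i is dominated by point j yet has a larger label. This naive construction is Θ(n²) in the worst case; the paper's contribution is building a usable violator graph of size o(n²) without enumerating every pair.

```python
def violator_edges(points, labels):
    """Naive violator-graph construction for a coordinate-wise
    (domination) partial order.  points[i] is a tuple of coordinates;
    there is an edge (i, j) when points[i] <= points[j] in every
    coordinate (i precedes j) but labels[i] > labels[j], so the pair
    violates isotonicity.  O(n^2 d) time for n points in d dimensions;
    illustration only, not the paper's o(n^2) construction."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dominated = all(points[i][d] <= points[j][d]
                            for d in range(len(points[i])))
            if dominated and labels[i] > labels[j]:
                edges.append((i, j))
    return edges
```

Note that the edge set depends only on comparable pairs; incomparable points (neither dominates the other) can never form a violator, which is why higher-dimensional orders have sparser constraints than linear ones.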

Keywords: L0 isotonic regression, monotonic, shape constrained nonparametric regression, linear order, multidimensional grid, product order, rendezvous graph, domination, violator graph

Complete paper: arXiv:2107.00251

Related work: Here is an overview of my work on shape constrained regression (isotonic, unimodal, step). The paper Lp Isotonic Regression using an L0 Approach is particularly relevant. Here is information about the fastest (in O-notation) isotonic regression algorithms known to date, for a variety of orderings and metrics.


Copyright © 2024 Quentin F. Stout.