Strict L∞ Isotonic Regression

Quentin F. Stout
Computer Science and Engineering, University of Michigan

 

Extended Abstract: This paper considers the problem of determining L∞ isotonic regressions satisfying the order constraints given by a partially ordered set specified via a DAG G = (V,E). Given a real-valued function f and a nonnegative-valued function w (the weights) on V, an isotonic regression of (f,w) is an order-preserving function on V that minimizes the weighted L∞ distance to f among all order-preserving functions. Unlike Lp isotonic regression for 1 < p < ∞, the L∞ regression is not unique. An L∞ regression of special interest is the strict L∞ isotonic regression, which is the limit, as p → ∞, of Lp isotonic regression. An algorithm for determining it is given, based on a new partial ordering of isotonic functions on V. The strict L∞ isotonic regression is the unique minimum of this ordering.
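For reference, the underlying optimization problem can be written out explicitly. This is the standard weighted L∞ formulation implied above; the symbol g for the regression function is my notation, not the paper's:

```latex
% g is an L-infinity isotonic regression of (f, w) on the DAG G = (V, E)
% if it minimizes the weighted maximum deviation over all order-preserving g:
\min_{g \ \mathrm{isotonic}} \; \max_{v \in V} \; w(v)\,\bigl|\, g(v) - f(v) \,\bigr| ,
\qquad \text{where } g \text{ is isotonic iff } g(u) \le g(v) \text{ for every edge } (u,v) \in E .
```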

Several algorithms for generating non-strict L∞ isotonic regressions have previously appeared in the literature. However, they do not always generate the expected regressions, even for trivial DAGs. For example, if the DAG is {1,2,3} with the natural ordering, the function values are f(1)=1, f(2)=-1, and f(3)=0.5, and the weights are all 1, then any L∞ isotonic regression must have value 0 at 1 and 2. One would expect the regression value at 3 to be 0.5, but no previously studied general-purpose L∞ algorithm produces this value. The strict L∞ isotonic regression does have value 0.5 at 3, as does the Lp isotonic regression for every 1 < p < ∞. We examine the behavior of previous algorithms as mappings from weighted functions over a DAG G to isotonic functions over G, showing that the fastest algorithms are not monotonic mappings, and that no previously studied algorithm preserves level set trimming (i.e., changing the function to be its regression value on a level set of the regression). In contrast, the strict L∞ isotonic regression, like the Lp regression for every 1 < p < ∞, is a monotonic mapping and preserves level set trimming.
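To make the example concrete, here is a minimal Python sketch (assuming unweighted data on a linear order) of the classical prefix-max / suffix-min construction, one simple way of producing a non-strict L∞ isotonic fit; the function name basic_linf_chain is illustrative, not from the paper. On the example above it returns 0.75 at the third point rather than the expected 0.5:

```python
def basic_linf_chain(f):
    """Unweighted L-infinity isotonic fit on a linear order via the classical
    prefix-max / suffix-min construction (a non-strict regression)."""
    prefix_max, running = [], float("-inf")
    for x in f:
        running = max(running, x)
        prefix_max.append(running)
    suffix_min, running = [0.0] * len(f), float("inf")
    for i in range(len(f) - 1, -1, -1):
        running = min(running, f[i])
        suffix_min[i] = running
    # midpoint of the tightest bracket around each position
    return [(a + b) / 2 for a, b in zip(prefix_max, suffix_min)]


print(basic_linf_chain([1.0, -1.0, 0.5]))  # [0.0, 0.0, 0.75] -- not 0.5 at the last point
```

Both components are nondecreasing, so the fit is isotonic and attains the optimal maximum error of 1, yet its value at the third point is 0.75 rather than the Lp-limit value 0.5.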

Algorithms are given showing that if the partially ordered set has n vertices, then for linear or tree orderings a simplification of the pool adjacent violators (PAV) algorithm yields the strict L∞ isotonic regression in Θ(n log n) time. For arbitrary orderings it can be determined in Θ(TC(G) + m + m* log m*) time, where TC(G) is the time to determine the transitive closure of G, m is the number of edges in the transitive closure, and m* is the number of violating pairs (i.e., pairs u, v where u precedes v but f(u) > f(v)).
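As an intuition-building sketch only, here is a naive PAV-style pass for an unweighted linear order in which each pooled block takes the midrange (max+min)/2 of the values it covers; pav_linf_chain and its block representation are my own illustrative choices, not the paper's Θ(n log n) algorithm, which additionally handles weights and guarantees the strict regression:

```python
def pav_linf_chain(f):
    """Naive PAV-style pooling for an unweighted linear order under L-infinity:
    adjacent blocks are merged while they violate the ordering, and each block
    takes the midrange (max + min) / 2 of the original values it covers."""
    blocks = []  # each block is [value, block_max, block_min, length]
    for x in f:
        blocks.append([x, x, x, 1])
        # pool backwards while the last two blocks are out of order
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            _, mx2, mn2, c2 = blocks.pop()
            _, mx1, mn1, c1 = blocks.pop()
            mx, mn = max(mx1, mx2), min(mn1, mn2)
            blocks.append([(mx + mn) / 2, mx, mn, c1 + c2])
    out = []
    for value, _, _, length in blocks:
        out.extend([value] * length)
    return out


print(pav_linf_chain([1.0, -1.0, 0.5]))  # [0.0, 0.0, 0.5]
```

On the three-point example above this returns [0.0, 0.0, 0.5], agreeing with the strict regression; the paper's simplification of PAV achieves the strict regression for weighted data on linear and tree orderings within the stated time bound.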


Keywords: isotonic regression, nonparametric, shape constrained, minimax error, L∞, uniform metric, Chebyshev metric, partial order, linear order, tree, "best best" L∞ approximation, multivariate regression

Complete paper: this paper appears in Journal of Optimization Theory and Applications 152 (2012), pp. 121-135.

 

Related work: Here is an overview of my work on shape constrained regression (isotonic, unimodal, step), and here is information about the fastest (in O-notation) isotonic regression algorithms known to date, for a variety of orderings and Lp metrics.

