Weighted L∞ Isotonic Regression

Quentin F. Stout
Computer Science and Engineering, University of Michigan


Abstract: This paper gives algorithms for determining L∞ isotonic regressions satisfying order constraints on a partially ordered set, specified via a directed acyclic graph (DAG). Each vertex has a value and a nonnegative weight. For each algorithm we examine its running time and the properties of the regressions it produces. The properties are examined because the nonuniqueness of the L∞ isotonic regression can result in some undesirable behavior.

For a DAG with n vertices and m edges we give an algorithm taking Θ(m log n) time. However, this algorithm is not monotonic, in that there can be functions f and g where pointwise f ≥ g, but the isotonic regression produced for f is not pointwise ≥ the regression produced for g. Further, it is based on parametric search, which is infeasible in practice.

An approach that is monotonic is to solve the prefix isotonic regression problem: for every element x of the partial order, determine the regression error of the isotonic regression on x and all of its predecessors. A prefix algorithm is given for linear and tree orders, taking Θ(n log n) time. It is also used to find a unimodal regression on a linear order in the same time bound. This completes the work in Unimodal Regression via Prefix Isotonic Regression by solving the remaining open case, namely L∞ isotonic regression on weighted data, which is more difficult than the unweighted case.
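To make the prefix idea concrete, here is a minimal sketch for the simpler unweighted case on a linear order (the paper's contribution is the harder weighted case; there the bound is Θ(n log n), while the unweighted scan below is linear). The function names are illustrative, not from the paper. For unweighted data the optimal L∞ error on a prefix is half the largest "drop" max over j ≤ k of vals[j] − vals[k] seen so far, and a unimodal regression error is obtained by combining prefix errors with prefix errors of the reversed sequence.

```python
def prefix_linf_errors(vals):
    """For each i, the optimal L-infinity isotonic regression error on
    vals[0..i].  Unweighted linear order only: the optimal error on a
    prefix is half the largest drop (prefix max minus current value)."""
    errs, pref_max, worst_drop = [], float("-inf"), 0.0
    for v in vals:
        pref_max = max(pref_max, v)
        worst_drop = max(worst_drop, pref_max - v)
        errs.append(worst_drop / 2)
    return errs

def unimodal_linf_error(vals):
    """Optimal L-infinity unimodal regression error on a linear order:
    for each candidate mode, take the worse of the increasing-fit error
    on its prefix and the decreasing-fit error on its suffix (the latter
    is a prefix error of the reversed sequence), then minimize."""
    left = prefix_linf_errors(vals)
    right = prefix_linf_errors(vals[::-1])[::-1]
    return min(max(l, r) for l, r in zip(left, right))
```

For example, on [3, 1, 2] the prefix errors are [0, 1, 1] (the prefix [3, 1] already forces error 1), and the best unimodal fit has error 0.5 (decreasing, with mode at the first element).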

Algorithms are also given to determine unimodal regression for an undirected tree, for isotonic regressions where the regression values must come from a specified set, and for isotonic regressions when the number of different weights, or number of different values, is much smaller than the number of vertices. An algorithm is also given for isotonic regression on a rooted tree, where the value of a parent must be the largest of its children's values. We call this ``river isotonic regression'' since it is similar to the naming convention for rivers.

The fastest algorithm mentioned above can produce Max, the regression which is pointwise the largest of any L∞ isotonic regression, and can produce Min, the pointwise minimum. In addition to the prefix algorithm, other regressions considered are ``basic'' and ``strict''. The former is the most commonly studied one, but people have rarely addressed the fact that it is nontrivial to compute for weighted data. The prefix approach simplifies this somewhat. The latter is discussed more thoroughly in the paper on strict L∞ isotonic regression. Strict regression is the "best best" L∞ isotonic regression. It is the limit, as p → ∞, of Lp isotonic regression. It has a strong property concerning the number of vertices with large regression errors (see the paper for a precise statement). The prefix approach has a somewhat weaker property in that it minimizes the number of vertices with maximal error. The basic approach also has this property. At the other end of the spectrum, at any vertex, one or both of Max and Min have the worst error among all isotonic regressions.
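For intuition about Max and Min, here is a hedged sketch for the unweighted linear-order special case, where closed forms are available (the weighted DAG case treated in the paper is harder; the function name is illustrative). With optimal error eps, any optimal regression g must satisfy vals[i] − eps ≤ g(i) ≤ vals[i] + eps together with monotonicity, so the pointwise extremes are a shifted prefix maximum and a shifted suffix minimum:

```python
def linf_max_min(vals):
    """Pointwise-largest (Max) and pointwise-smallest (Min) optimal
    L-infinity isotonic regressions, unweighted linear order.

    Min(i) = (prefix max up to i) - eps : smallest value monotonicity
             and the lower error bounds allow.
    Max(i) = (suffix min from i) + eps : largest value monotonicity
             and the upper error bounds allow.
    """
    n = len(vals)
    pref_max = vals[:]                      # running maximum, left to right
    for i in range(1, n):
        pref_max[i] = max(pref_max[i], pref_max[i - 1])
    suf_min = vals[:]                       # running minimum, right to left
    for i in range(n - 2, -1, -1):
        suf_min[i] = min(suf_min[i], suf_min[i + 1])
    # optimal error: half the largest drop between an earlier and later value
    eps = max(p - v for p, v in zip(pref_max, vals)) / 2
    return [s + eps for s in suf_min], [p - eps for p in pref_max]
```

On [3, 1, 2] this gives eps = 1, Max = [2, 2, 3], and Min = [2, 2, 2]; both are monotonic, both have maximum error 1, and Max dominates Min pointwise, as the abstract describes.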

Keywords: isotonic regression, unimodal regression, monotonic, nonparametric, shape constrained, prefix isotonic regression, minimax error, L∞, uniform metric, Chebyshev metric, partial order, DAG, linear order, tree order

Complete paper. This is a major revision of an earlier version from 2008. The prior approach to multidimensional data, based on constructing a new DAG which has a more succinct representation of the ordering, has been moved to Isotonic Regression for Multiple Independent Variables, which also applies the approach to L1 and L2 metrics.

Related work: Here is information about the fastest isotonic regression algorithms known to date, for a variety of orderings and metrics.

Here are papers on L1 isotonic regression (also known as isotonic median regression), isotonic regression for multiple independent variables, isotonic regression with a fixed number of level sets, and unimodal regression. The paper on L1 isotonic regression also includes new algorithms for Lp isotonic regression, 1 < p < ∞. These are based on using a sequence of L1 binary-valued isotonic regressions. For L2 several of these improve upon prior results, which is somewhat surprising since it is the case that has been studied the most.

Copyright © 2008-2017 Quentin F. Stout.