### Isotonic Regression via Partitioning

Quentin F. Stout
Computer Science and Engineering, University of Michigan

Abstract: This paper gives algorithms for determining weighted isotonic regressions satisfying order constraints given by various ordered sets (directed acyclic graphs, aka DAGs). For the L1 metric a partitioning approach is used which exploits the fact that L1 regression values can always be chosen to be data values. Extending this approach, algorithms for binary-valued L1 isotonic regression are used to find Lp isotonic regressions for 1 < p < ∞. Algorithms are given for trees, 2-dimensional and multidimensional orderings, and arbitrary DAGs. Algorithms are also given for Lp isotonic regression with constrained data and weight values, L1 regression with unweighted data, and L1 regression for DAGs where there are multiple data values at the vertices.

Let n denote the number of vertices and m the number of edges. The algorithms show that one can determine the L1 regression for a

• line or rooted tree in Θ(n log n) time,
• star or complete bipartite DAG in Θ(n) time,
• 2-dimensional grid in Θ(n log n) time,
• 2-dimensional data in arbitrary locations in Θ(n log^2 n) time,
• d-dimensional data in arbitrary locations, d ≥ 3, in Θ(n^2 log^d n) time,
• arbitrary DAG in Θ(n^2.5 log n) time if the data is unweighted.
The ordering for d-dimensional data is the natural componentwise ordering, where (x_1,...,x_d) precedes (y_1,...,y_d) if and only if x_i ≤ y_i for all 1 ≤ i ≤ d.
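To illustrate the key fact that the partitioning approach exploits, namely that L1 regression values can always be chosen among the data values, here is a simple sketch of L1 isotonic regression on a linear order using pool-adjacent-violators with weighted medians. This is a straightforward quadratic-time illustration, not the paper's Θ(n log n) partitioning algorithm, and the function names are my own.

```python
def weighted_median(values, weights):
    """Smallest data value v such that the weight of points <= v is at
    least half the total weight; the result is always a data value."""
    total = sum(weights)
    acc = 0.0
    for v, w in sorted(zip(values, weights)):
        acc += w
        if 2 * acc >= total:
            return v

def l1_isotonic_line(values, weights):
    """Pool-adjacent-violators for L1 isotonic regression on a chain.
    Each block's regression value is its weighted median, so every
    fitted value is drawn from the original data values."""
    blocks = []  # each block: (values in block, weights in block, median)
    for v, w in zip(values, weights):
        blocks.append(([v], [w], v))
        # merge blocks while the nondecreasing constraint is violated
        while len(blocks) > 1 and blocks[-2][2] > blocks[-1][2]:
            v2, w2, _ = blocks.pop()
            v1, w1, _ = blocks.pop()
            vals, wts = v1 + v2, w1 + w2
            blocks.append((vals, wts, weighted_median(vals, wts)))
    fit = []
    for vals, _, med in blocks:
        fit.extend([med] * len(vals))
    return fit
```

For example, `l1_isotonic_line([3, 1, 2, 5, 4], [1, 1, 1, 1, 1])` returns the nondecreasing fit `[1, 1, 2, 4, 4]`, every entry of which is one of the input values.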

The paper's results for Lp apply for general p, but vary somewhat depending on whether p is 2, 3, 4, or 5; an even integer; an odd integer; or nonintegral. See the paper for the complete results and an explanation of the differences. To illustrate, the times of the new algorithms for L2 are:

• 2-dimensional data in arbitrary locations: Θ(n^2 log n),
• d-dimensional grid, d ≥ 3: Θ(n^3 log n),
• d-dimensional data in arbitrary locations, d ≥ 3: Θ(n^3 log^d n),
• arbitrary DAG: Θ(n^2 m + n^3 log n)
For general p there are algorithms which are:
• semi-exact, meaning as exact as one can compute the root of a specific equation (this is exact for p = 2, 3, 4, and 5);
• accurate to within δ, meaning that the isotonic regression generated is pointwise within δ of the optimal isotonic regression; and
• exact, when the data has binary values and positive integer weights in a constrained range.
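As a small illustration (not from the paper) of what "accurate to within δ" means for a single vertex set with no order constraints: for 1 < p < ∞ the weighted Lp mean minimizes a strictly convex sum, so its derivative is monotone and bisection brackets the minimizer to any desired tolerance. The function name and interface here are my own.

```python
def lp_mean(values, weights, p, delta):
    """Minimize sum_i w_i * |v_i - a|**p over a, by bisection on the
    monotone derivative; returns a point within delta of the optimum.
    For p = 2 the exact answer is the weighted mean, so bisection
    converges to it, illustrating delta-accuracy."""
    def deriv(a):
        # derivative of the objective; the guard avoids 0**0 issues at a data value
        return sum(w * p * abs(a - v) ** (p - 1) * (1 if a > v else -1)
                   for v, w in zip(values, weights) if a != v)
    lo, hi = min(values), max(values)
    while hi - lo > delta:
        mid = (lo + hi) / 2
        if deriv(mid) < 0:
            lo = mid   # minimizer lies to the right
        else:
            hi = mid   # minimizer lies to the left
    return (lo + hi) / 2
```

For p = 2 with values 0, 4 and weights 1, 3 the weighted mean is 3, and `lp_mean([0.0, 4.0], [1.0, 3.0], 2, 1e-9)` converges to 3 within the tolerance; for p where the stationarity equation has no closed-form root, the same bisection yields the δ-accurate value.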

The algorithm for d-dimensional data in arbitrary locations, d ≥ 3, first creates a new DAG with more vertices but, in the worst case, significantly fewer edges, and then performs isotonic regression on the new DAG. The construction of this DAG is described in Isotonic Regression for Multiple Independent Variables.

Notes not in the paper:
• All of the results for approximation within δ for arbitrary p apply, without change, to arbitrary convex functions, assuming one can evaluate the functions in unit time. Further, the convex functions at different vertices need not be weighted multiples of the same function.
• The results for L1 are called exact because at each vertex one can always use one of the data values as the regression value. However, floating point arithmetic on the weights may undermine this exactness. For example, for the values 1, 2, 3, 4 with weights 10^-30, 10^30, 10^30, 10^-29, the unique weighted median is 3. However, because adding the first two weights, or the last two, gives exactly the same result even in double precision arithmetic, essentially any implementation would report the interval [2,3] as the median.
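The precision issue in the note above can be reproduced directly: in IEEE double precision, a tiny weight added to a huge one is absorbed, so the cumulative weights on the two sides of the tie compare as exactly equal.

```python
# Weights for the data values 1, 2, 3, 4 from the example above.
w = [1e-30, 1e30, 1e30, 1e-29]

left = w[0] + w[1]    # total weight at or below 2
right = w[2] + w[3]   # total weight at or above 3

# Mathematically right > left since 1e-29 > 1e-30, making 3 the unique
# weighted median. In double precision both sums round to exactly 1e30,
# so an implementation sees a tie and reports the interval [2, 3].
print(left == right)  # True
```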

Keywords: weighted isotonic median regression, nonparametric, shape constrained, L1, Lp, partial order, DAG, linear order, chain, tree, star, bivariate, multidimensional, classification, unweighted

Complete paper. This appears in Algorithmica 66 (2013), pp. 93-112.

Related work: Here is information about the best isotonic regression algorithms known to date, for a variety of orderings and metrics.