Quentin F. Stout

Computer Science and Engineering, University of Michigan

**Abstract:**
This paper gives algorithms for determining weighted isotonic regressions
satisfying order constraints given by various ordered sets (directed
acyclic graphs, aka DAGs).
For the L_{1} metric a partitioning approach is used which exploits
the fact that L_{1} regression values can always be chosen to be data values.
Extending this approach, algorithms for binary-valued L_{1} isotonic
regression are used to find L_{p} isotonic regressions for
1 < p < ∞.
Algorithms are given for trees, 2-dimensional and
multidimensional orderings, and arbitrary DAGs.
Algorithms are also given for L_{p} isotonic regression with
constrained data and
weight values, L_{1} regression with unweighted data,
and L_{1} regression
for DAGs where there are multiple data values at the vertices.

Let *n* denote the number of vertices and *m* the number of edges.
The algorithms show that one can determine the L_{1} regression for a

- line and rooted tree in Θ(n log n) time,
- star, and complete bipartite DAG, in Θ(n) time,
- 2-dimensional grid in Θ(n log n) time,
- 2-dimensional data in arbitrary locations in
Θ(n log
^{2}n) time, - d-dimensional data in arbitrary locations, d ≥ 3, in
Θ(n
^{2}log^{d}n) time, - arbitrary DAG in Θ(n
^{2.5}log n) time if the data is unweighted.

The paper's results for L_{p} apply for general p, but vary somewhat
depending on whether p is 2, 3, 4, or 5, is an even integer,
is an odd integer, or is nonintegral.
See the paper for an explanation of the complete results and reasons
for the differences.
To illustrate, the times of the new algorithms for L_{2} are:

- 2-dimensional data in arbitrary locations: Θ(n
^{2}log n), - d-dimensional grid, d ≥ 3: Θ(n
^{3}log n), - d-dimensional data in arbitrary locations, d ≥ 3:
Θ(n
^{3}log^{d}n), - arbitrary DAG: Θ(n
^{2}m + n^{3}log n)

The algorithm for a DAG which is d-dimensional data in arbitrary locations, for d ≥ 3, is based on first creating a new DAG with more vertices but significantly fewer edges in the worst case, and then doing isotonic regression on the new DAG. The construction of the DAG is described in Isotonic Regression for Multiple Independent Variables.

Notes not in the paper:- All of the results for approximation within δ for arbitrary p apply, without change, to arbitrary convex functions, assuming one can evaluate the functions in unit time. Further, they need not be multiples of the same function.
- The results for L
_{1}are called exact because at each vertex one can always use one of the data values as the regression value. However, the weights may betray the exactness. For example, to find the median of values 1, 2, 3, 4, with weights 10^{-30}, 10^{30}, 10^{30}, 10^{-29}, the unique median is 3. However, because adding the first two, or last two, weights gives exactly the same answer even in double precision arithmetic, essentially any implementation would yield the interval [2,3] as the median.

**Keywords:** weighted isotonic median regression, nonparametric,
shape constrained, L_{1}, L_{p},
partial order, DAG, linear order, chain, tree, star, bivariate,
multidimensional, classification, unweighted

**Complete paper**. This
appears in *Algorithmica* 66 (2013), pp. 93-112.

Here are papers on weighted
L_{∞} isotonic regression for arbitrary dags (also known as minimax regression),
strict L_{∞} isotonic
regression (this is the limit, as p → ∞, of L_{p}
isotonic regression),
L_{∞} isotonic regression for linear, multidimensional, and tree orders,
isotonic regression for
multiple independent variables,
isotonic regression with a fixed number of level sets, and unimodal regression.

Copyright © 2009-2017 Quentin F. Stout. |