Optimal Reduced Isotonic Regression

Janis Hardwick     Quentin F. Stout
University of Michigan


Abstract: Isotonic regression is a shape-constrained nonparametric regression in which the ordinate is a nondecreasing function of the abscissa. The regression outcome is a nondecreasing step function. For an initial set of n points, the number of steps m in the isotonic regression may be as large as n, and as a result standard isotonic regression has been criticized as overfitting the data or producing too complicated a representation. So-called "reduced" isotonic regression constrains the outcome to a specified number of steps, b. However, because the fastest previous algorithm for determining an optimal reduced isotonic regression under the L2 metric takes Θ(n + bm²) time, researchers felt exact algorithms were too slow and used approximations instead; other researchers obtained only approximate results because they used a greedy top-down approach. Here we give an algorithm that finds the exact solution in Θ(n + bm) time, and a simpler algorithm taking Θ(n + bm log m) time.
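The paper's fast algorithms are not reproduced here, but the problem itself can be illustrated by the classical two-stage approach it improves upon: first compute the unconstrained isotonic regression with Pool Adjacent Violators (PAVA), yielding m nondecreasing levels, then merge those levels into b steps with the textbook Θ(bm²) dynamic program. This is a sketch, not the paper's Θ(n + bm) method; all function names are mine.

```python
import numpy as np

def pava(y, w=None):
    """L2 isotonic regression by Pool Adjacent Violators.
    Returns (levels, weights): the nondecreasing step values and the
    total weight of the points pooled into each step."""
    y = np.asarray(y, float)
    w = np.ones(len(y)) if w is None else np.asarray(w, float)
    means, wts = [], []
    for yi, wi in zip(y, w):
        m, ww = yi, wi
        while means and means[-1] > m:      # pool adjacent violators
            m0, w0 = means.pop(), wts.pop()
            m = (w0 * m0 + ww * m) / (w0 + ww)
            ww += w0
        means.append(m)
        wts.append(ww)
    return means, wts

def reduce_steps(levels, weights, b):
    """Merge m isotonic levels into b steps, minimizing weighted L2
    error, via the classical Theta(b m^2) dynamic program (the paper's
    contribution is solving this step faster).  Returns the optimal
    error and a list of (start, end, value) step descriptions."""
    lv, w = np.asarray(levels, float), np.asarray(weights, float)
    m = len(lv)
    # prefix sums give O(1) weighted SSE for any contiguous segment
    W = np.concatenate(([0.0], np.cumsum(w)))
    S = np.concatenate(([0.0], np.cumsum(w * lv)))
    Q = np.concatenate(([0.0], np.cumsum(w * lv * lv)))
    def sse(i, j):  # weighted SSE of levels[i:j] about their mean
        return Q[j] - Q[i] - (S[j] - S[i]) ** 2 / (W[j] - W[i])
    D = np.full((b + 1, m + 1), np.inf)   # D[k, j]: best k-step cost
    D[0, 0] = 0.0                         # for the first j levels
    back = np.zeros((b + 1, m + 1), dtype=int)
    for k in range(1, b + 1):
        for j in range(k, m + 1):
            for i in range(k - 1, j):
                c = D[k - 1, i] + sse(i, j)
                if c < D[k, j]:
                    D[k, j], back[k, j] = c, i
    steps, j = [], m                      # backtrack the partition
    for k in range(b, 0, -1):
        i = back[k, j]
        steps.append((i, j, (S[j] - S[i]) / (W[j] - W[i])))
        j = i
    return D[b, m], steps[::-1]
```

For example, `pava([1, 2, 1, 2, 5, 6, 5, 6])` pools the data into the six levels 1, 1.5, 2, 5, 5.5, 6, and `reduce_steps` with b = 2 merges them into two steps at the weighted means 1.5 and 5.5. The merging stage only ever combines whole levels of the isotonic fit, which is why its cost depends on m rather than n.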

Our algorithms provide faster solutions for Fisher's "unrestricted maximum homogeneity" approximation and Ioannidis' optimal variable-width "serial histogram" problem (also known as "v-optimal histograms"). They also determine optimal k-means clustering of 1-dimensional data.

Keywords: reduced isotonic regression algorithm, v-optimal histogram, segmentation, monotonic, piecewise constant approximation, step function, k-means clustering, nonparametric regression

Complete paper: arXiv 1412.2844

Related work: This paper improves upon our results here. That paper also contains algorithms for L1, while this paper develops algorithms only for L2.

Here is an overview of my work on shape-constrained regression (isotonic, unimodal, step), and here is information about the fastest (in O-notation) isotonic regression algorithms known to date, for a variety of orderings and metrics.


Quentin Stout homepage Copyright © 2014-2024 Quentin F. Stout