Abstract: Isotonic regression is a shape-constrained nonparametric regression in which the ordinate is a nondecreasing function of the abscissa. The regression outcome is an increasing step function. For an initial set of n points, the number of steps in the isotonic regression, m, may be as large as n. As a result, standard isotonic regression has been criticized as overfitting the data or making the representation too complicated. So-called "reduced" isotonic regression constrains the outcome to be a specified number of steps, b. However, because the fastest previous algorithm for determining an optimal reduced isotonic regression for the L_{2} metric takes Θ(n+ b m^{2}) time, reseachers felt the algorithms were too slow and instead used approximations. Other researchers had results that were approximations because they used a greedy top-down approach. Here we give an algorithm to find the exact solution in Θ(n + bm) time, and a simpler algorithm taking Θ(n + b m log m) time.
Our algorithms provide faster solutions for Fisher's "unrestricted maximum homogeneity" approximation and Ioannidis' optimal variable-width "serial histogram" problem (also known as "v-optimal histograms"). They also determine optimal k-means clustering of 1-dimensional data.
Keywords: reduced isotonic regression algorithm, v-optimal histogram, segmentation, monotonic, piecewise constant approximation, step function, k-means clustering, nonparametric regression
Complete paper. arXiv 1412.2844
Related work: This paper improves upon our results here. That paper also contains algorithms for L_{1}, while this paper only developes algorithms for L_{2}. Here is a paper on reduced isotonic regression and stepwise approximation for the L_{∞} metric. Here is information about the fastest isotonic regression algorithms, with no constraint on the number of steps, for a variety of orderings and metrics.
Copyright © 2014-2017 Quentin F. Stout |