Optimal Staged Adaptive Sampling Designs

Optimal Few-Stage Designs

Janis Hardwick Quentin F. Stout
University of Michigan

Abstract: We give algoirthms that create optimal designs for experiments in which sampling is carried out in stages. There are two Bernoulli populations and it is assumed that the outcomes of the previous stage are available before the sampling decision for the next stage is determined. At each stage, the design specifies the number of observations to be taken and the relative proportion to be sampled from each population. Of particular interest are 2-stage and 3-stage designs, which is why we call the work ``few-stage''.

The designs are strongly optimal in that, among all designs with the same sample size, priors, and number of stages, they optimize the expected value of the objective function. Most other designs are suboptimal because they require that the stage sizes always be the same no matter what observations have occured, or because they use approximations to determine the design's parameters.

To illustrate that the algorithms can be used for experiments of useful sample sizes, they are applied to estimation and optimization problems. Results show that for problems of moderate size, published asymptotic analyses do not always represent the true behavior of the optimal stage sizes, and efficiency may be lost if the analytical results are used instead of the true optimal allocation. For example, suppose an analytical analysis of a 2-stage design with n observations shows that the optimal first stage grows like O(√n). Does this mean it should grow like 0.5√n, or 10√n, or ...? Even if it is proven that asymptotically it should grow like 5√n, one might be confident in using a first stage of 5,000 when n=1,000,000, but is 50 the optimal first stage size when n=100?

In addition to finding optimal designs, our results also suggest that one might approach large problems by extrapolating optimal solutions for moderate sample sizes; and, that approaches of this sort could give design guidelines that are far more explicit (and hopefully closer to optimal) than those obtained through asymptotic analyses alone.

The exactly optimal few stage designs discussed here are generated computationally, and the examples presented indicate the ease with which this approach can be used to solve problems that present analytical difficulties. The algorithms are flexible and provide for the accurate representation of important characteristics of the problem. It is also shown that in various cases the base algorithms can be modified to incorporate simplifications. In such settings, significant speedups and space reductions can be obtained, permitting the exact solution of even larger problems.

With simple modifications the program can be used to optimize response adaptive designs where there is optional stopping, i.e., the sample size is not fixed. One of the examples optimizes a 2-arm clinical trial where there is a cost per patient in the trial, and where the number of future patients after the trial who will utilize the better therapy is a function of the estimated success rate of the therapy. Note that this has some similarities with the "Behavioral Bayes" analyses due to Gittins and Pezeshk.

Keywords: sequential analysis, controlled clinical trial, two-stage procedure, three-stage, design of experiments, group allocation, response-adaptive sampling design , bandit problem, dynamic programming, active learning

Complete paper This paper appears in Journal of Statistical Planning and Inference 104 (2002), pp. 121-145.
The algorithms were first published in 1995 in Hardwick, J and Stout, QF, "Determining optimal few stage allocation rules", Computing Science and Stat. 27 (1995), pp. 342-346.

Related Work

Seminar Presentation:: Optimal Few-Stage Designs for Clinical Trials.
Adaptive Allocation:: Here is an explanation of adaptive sampling, and here are our relevant publications.
Dynamic Programming (also known as backward induction):: Here is an overview of our work.