Algorithms for Response Adaptive Sampling Designs

Flexible Algorithms for Creating and Analyzing Adaptive Sampling Procedures

Janis Hardwick Quentin F. Stout
University of Michigan

Abstract: We describe a collection of algorithms and techniques that have been developed to aid in the design and analysis of adaptive allocation procedures for statistical experiments. An adaptive procedure is one in which the observations during the experiment can be used to adjust the ongoing experiment. These are also called response adaptive procedures, in contrast to procedures which make allocation decisions based on covariates. Typically, decisions such as how to sample during an experiment are made and fixed in advance. For example, in a classical controlled clinical trial, patients are allocated to one of two different treatment options with half being assigned to each therapy. At the end of the experiment a decision is made as to which treatment is more effective. In contrast, with an adaptive procedure, patient outcomes that have already been observed can be used to adjust the allocation of future patients. Properly designed, this allows one to improve expected patient outcomes as the experiment is running, while still being able to reach good statistical decisions at the end. Thus, adaptive procedures can offer significant ethical and cost advantages over standard fixed procedures. Adaptive procedures are of interest to statisticians, to people working in stochastic optimization, to control theorists, and to computer scientists. In machine learning this approach is known as ``active learning''.

Unfortunately, adaptive procedures are harder to design and analyze, which has severely limited their use. To help overcome this difficulty, we have developed a collection of programs and algorithmic techniques for the design and analysis of adaptive procedures. The emphasis is on providing flexibility to the investigator, so that appropriate statistical and practical concerns can be addressed directly. The techniques described allow for optimizations previously not attainable. They also permit exact evaluations for a wide range of criteria and are intended to encourage investigators to explore more alternatives. Optimizations investigated include 2- and 3-arm bandit models, staged allocation, and models with constrained switching between options. One of our algorithmic approaches, path induction, speeds up the process of evaluating a design multiple times so that thorough robustness studies can be undertaken. Our approaches can be utilized with both Bayesian and frequentist analyses, and the analysis approach can differ from the approach used in the design.

Keywords: response adaptive design, sequential sampling, stochastic optimization, design of experiments, controlled clinical trial, dynamic programming, bandit models, algorithms, staged allocation, path induction, switching constraints, robustness, machine learning, active learning, ``learn and confirm'', computational learning theory, parallel computing

Complete paper. This appears in New Developments and Applications in Experimental Design, N. Flournoy, W.F. Rosenberger, and W.K. Wong, eds., IMS Lecture Notes--Monograph Series vol. 34, 1998, pp. 91-105.

Related Work:

Adaptive Allocation:: An explanation of response-adaptive sampling, including a description of bandit problems , and our relevant publications.
Dynamic Programming (also known as Backward Induction):: An overview of our work on dynamic programming.