Abstract: Adaptive designs are effective mechanisms for flexibly allocating experimental resources. In clinical trials particularly, such designs allow researchers to balance short and long term goals of improving patient well being. This tradeoff often takes the form of trying to minimize patient losses during the experiment while simultaneously gathering sufficient information to support decisions about the treatment of future patients. Unfortunately, fully sequential designs require immediate responses since each allocation depends upon the responses of all previous assignments. Since optimal fully sequential strategies cannot be applied when responses are delayed, we seek optimal designs for models that specifically incorporate delays. Of interest are loss of efficiency compared to non-delayed situations and improvement over previously proposed but suboptimal delayed response designs. This appears to be the first work in which non-trivial delayed response models are fully optimized.
We assume two treatment options where responses are Bernoulli random variables. Although general objective functions can be handled, in our primary example we optimize patient successes during the experiment. Note that this problem reduces to a Bayesian 2-armed bandit problem when the delays are zero. Optimizing a 2-armed bandit that incorporates a delay structure can be solved via dynamic programming, but requires new algorithms.
Here we analyze a delay model in which patients arrive according to a Poisson process and their response times on each treatment are exponential. The results show that, except when the delay rate is several orders of magnitude different than the patient arrival rate, the delayed response bandit has surprisingly high efficiency compared to the optimal immediate response model. We also consider a randomized play-the-winner rule and find it to be far less efficient than the optimal delayed response bandit rule. We analyze the effects of various parallelization options, and while standard approaches do not work well, with effort an efficient, highly scalable program can be developed. This allows us to solve problems thousands of times more complex than those solved previously, which helps make adaptive designs practical. Further, our work applies to many other problems involving neighbor recurrences, such as generalized string matching.
Keywords: 2-arm bandit problem, delayed response, randomized play-the-winner, response adaptive procedure, adaptive allocation, sequential sampling procedure, controlled clinical trial, dynamic programming, design of experiments, parallel computing, supercomputing,
Complete paper. This paper appears in MODA6: Model-Oriented Data Analysis, A. Atkinson, P. Hackl, W. Mueller, eds., Physica Verlag, 2001, pp. 127-134.
|Copyright © 1996-2017 Quentin F. Stout.|