Richard Sutton and Satinder Singh (1994) "On Bias and Step Size in Temporal-Difference Learning", Proceedings of the Eighth Yale Workshop on Adaptive and Learning Systems, pp. 91-96, Yale University, New Haven, CT.