Papers in Reverse Chronological Order


Refereed Conference and Journal Papers

  1. Value Prediction Networks
    by Junhyuk Oh, Satinder Singh, Honglak Lee.
    In Neural Information Processing Systems (NIPS), 2017.
    arXiv.

  2. Repeated Inverse Reinforcement Learning
    by Kareem Amin, Nan Jiang, and Satinder Singh.
    In Neural Information Processing Systems (NIPS), 2017.
    arXiv.

  3. A Big Step for AI
    by Satinder Singh.
    In Nature: News & Views, 2017.
    pdf.

  4. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
    by Junhyuk Oh, Satinder Singh, Honglak Lee, and Pushmeet Kohli.
    In International Conference on Machine Learning (ICML), 2017.
    pdf.

  5. Learning to Query, Reason, and Answer Questions on Ambiguous Texts.
    by Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerald Tesauro, and Satinder Singh.
    In 5th International Conference on Learning Representations (ICLR), 2017.
    pdf.

  6. Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes.
    by Shun Zhang, Edmund Durfee, and Satinder Singh.
    In 27th International Conference on Automated Planning and Scheduling (ICAPS), 2017.
    pdf.

  7. Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making.
    by Qi Zhang, Satinder Singh, and Edmund Durfee.
    In 27th International Conference on Automated Planning and Scheduling (ICAPS), 2017.
    pdf.

  8. Predicting Counselor Behaviors in Motivational Interviewing Encounters.
    by Veronica Perez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence An, Kathy J. Goggin, and Delwyn Catley.
    In Proceedings of the European Association of Computational Linguistics (EACL), 2017.
    pdf.

  9. Control of Memory, Active Perception, and Action in Minecraft.
    by Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, and Honglak Lee.
    In 33rd International Conference on Machine Learning (ICML), 2016.
    pdf.

  10. Gradient Methods for Stackelberg Security Games.
    by Kareem Amin, Satinder Singh, and Michael Wellman.
    In Conference on Uncertainty in Artificial Intelligence (UAI), 2016.
    pdf.

  11. Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games.
    by Xiaoxiao Guo, Satinder Singh, Richard Lewis, and Honglak Lee.
    In 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016.
    pdf.

  12. Commitment Semantics for Sequential Decision Making Under Reward Uncertainty.
    by Qi Zhang, Edmund Durfee, Satinder Singh, Anna Chen, and Stefan Witwicki.
    In 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016.
    pdf.

  13. On Structural Properties of MDPs that Bound Loss Due to Shallow Planning.
    by Nan Jiang, Satinder Singh and Ambuj Tewari.
    In 25th International Joint Conference on Artificial Intelligence (IJCAI), 2016.
    pdf.

  14. On the Trustworthy Fulfillment of Commitments.
    by Edmund Durfee and Satinder Singh.
    In Proceedings of the 18th International Workshop on Trust in Agent Societies (TRUST), 2016.
    pdf.

  15. Building a Motivational Interviewing Dataset.
    by Veronica Perez-Rosas, Rada Mihalcea, Kenneth Resnicow, Lawrence An, and Satinder Singh.
    In Proceedings of the NAACL 2016 Workshop on Clinical Psychology, 2016.
    pdf.

  16. Patient-Centered Pain Care Using Artificial Intelligence and Mobile Health Tools: Protocol for a Randomized Study Funded by the US Department of Veterans Affairs Health Services Research and Development Program.
    by Piette JD, Krein SL, Striplin D, Marinec N, Kerns RD, Farris KB, Singh S, An L, and Heapy AA.
    In JMIR Research Protocols, 5(2), 2016.
    pdf.

  17. Improving Predictive State Representations via Gradient Descent.
    by Nan Jiang, Alex Kulesza, and Satinder Singh.
    In Thirtieth AAAI Conference on Artificial Intelligence (AAAI), 2016.
    pdf.

  18. Confirming the theoretical structure of expert-developed text messages to improve adherence to anti-hypertensive medications.
    by Karen Farris, Teresa Salgado, Peter Batra, John Piette, Satinder Singh, Ahmed Guhad, Sean Newman, Vincent Marshall, and Larry An.
    In Research in Social and Administrative Pharmacy, 2015.
    pdf.

  19. Action-Conditional Video Prediction Using Deep Networks in ATARI Games.
    by Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard Lewis, and Satinder Singh.
    In Neural Information Processing Systems, 2015.
    online videos
    arxiv pdf, NIPS pdf, NIPS Appendix pdf.

  20. Multi-Task Seizure Detection: Addressing Inter-Patient and Intra-Patient Variations in Seizure Morphologies.
    by Alex Van Esbroeck, Landon Smith, Zeeshan Syed, Satinder Singh, and Zahi Karam.
    In Machine Learning, 2015.
    pdf.

  21. Abstraction Selection in Model-Based Reinforcement Learning.
    by Nan Jiang, Alex Kulesza, and Satinder Singh.
    In 32nd International Conference on Machine Learning (ICML), 2015.
    pdf.

  22. The Dependence of Effective Planning Horizon on Model Accuracy.
    by Nan Jiang, Alex Kulesza, Satinder Singh, and Richard Lewis.
    In International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2015.
    Best Paper Award
    pdf.

  23. Low-Rank Spectral Learning with Weighted Loss Functions.
    by Alex Kulesza, Nan Jiang, and Satinder Singh.
    In Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.
    pdf.

  24. Spectral Learning of Predictive State Representations with Insufficient Statistics.
    by Alex Kulesza, Nan Jiang, and Satinder Singh.
    In Twenty-Ninth AAAI Conference, 2015.
    pdf.

  25. Optimal Rewards for Cooperative Agents.
    by Bingyao Liu, Satinder Singh, Richard Lewis, and Shiyin Qin.
    In IEEE Transactions on Autonomous Mental Development, Vol 6, Issue 4, 2014.
    pdf.

  26. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning.
    by Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard Lewis, and Xiaoshi Wang.
    In Neural Information Processing Systems (NIPS), 2014.
    pdf.

  27. Computationally Rational Saccadic Control: An Explanation of Spillover Effects Based on Sampling from Noisy Perception and Memory.
    by Michael Shvartsman, Richard L Lewis, and Satinder Singh.
    In Cognitive Modeling and Computational Linguistics (CMCL), 2014.
    pdf.

  28. The Potential Impact of Intelligent Systems for Mobile Health Self-Management Support: Monte-Carlo Simulations of Text Message Support for Medication Adherence.
    by John Piette, Karen Farris, Sean Newman, Larry An, Jeremy Sussman, and Satinder Singh.
    In Annals of Behavioral Medicine, 2014.
    pdf.

  29. Low-Rank Spectral Learning.
    by Alex Kulesza, Nadakuditi Raj Rao, and Satinder Singh.
    In Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
    pdf.

  30. Evaluating Trauma Patients: Addressing Missing Covariates with Joint Optimization.
    by Alex Van Esbroeck, Satinder Singh, Ilan Rubinfeld, and Zeeshan Syed.
    In 28th AAAI Conference on Artificial Intelligence (AAAI-14), 2014.
    pdf.

  31. Predicting Postoperative Atrial Fibrillation from Independent ECG Components.
    by Chih-Chun Chia, James Blum, Zahi Karam, Satinder Singh, and Zeeshan Syed.
    In 28th AAAI Conference on Artificial Intelligence (AAAI-14), 2014.
    pdf.

  32. Ecologically Valid Long-Term Mood Monitoring of Individuals with Bipolar Disorder Using Speech.
    by Zahi Karam, Emily Mower Provost, Satinder Singh, Jennifer Montgomery, Christopher Archer, Gloria Harrington, and Melvin McInnis.
    In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014.
    pdf.

  33. Characterizing EVOI-Sufficient k-Response-Query Sets in Decision Problems.
    by Robert Cohn, Satinder Singh, and Edmund Durfee.
    In Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014.
    pdf.

  34. Improving UCT Planning via Approximate Homomorphisms.
    by Nan Jiang, Satinder Singh, and Richard Lewis.
    In 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2014.
    pdf.

  35. Utility Maximization and Bounds on Human Information Processing.
    by Andrew Howes, Richard L Lewis, and Satinder Singh.
    In Topics in Cognitive Science, Volume 6, Issue 2, pages 198-203, 2014.
    pdf.

  36. Computing Solutions in infinite-horizon discounted adversarial patrolling games.
    by Yevgeniy Vorobeychik, Bo An, Milind Tambe, and Satinder Singh.
    In 24th International Conference on Automated Planning and Scheduling (ICAPS), 2014.
    pdf.

  37. Computational Rationality: Linking Mechanism and Behavior Through Utility Maximization.
    by Richard L Lewis, Andrew Howes, and Satinder Singh.
    In Topics in Cognitive Science, Volume 6, Issue 2, pages 279-311, 2014.
    pdf.

  38. Reward Mapping for Transfer in Long-Lived Agents.
    by Xiaoxiao Guo, Satinder Singh, and Richard L Lewis.
    In Advances in Neural Information Processing Systems (NIPS), 26, 2013.
    pdf.

  39. The adaptive nature of eye-movements in linguistic tasks: How payoff and architecture shape speed-accuracy tradeoffs.
    by Richard L Lewis, Michael Shvartsman, and Satinder Singh.
    In Topics in Cognitive Science, Vol. 5, Issue 3, pages 581-610, 2013.
    pdf.

  40. Linking Context to Evaluation in the Design of Safety Critical Interfaces.
    by Michael Feary, Dorritt Billman, Xiuli Chen, Andrew Howes, Richard Lewis, Lance Sherry, and Satinder Singh.
    In Proceedings of Human-Computer Interaction International, 2013.
    pdf.

  41. An Exploration of Low-Rank Spectral Learning.
    by Alex Kulesza, Nadakuditi Raj Rao, and Satinder Singh.
    In ICML Workshop on Spectral Learning, 2013.
    pdf.

  42. Maximizing the Value of Mobile Health Monitoring by Avoiding Redundant Patient Records: Prediction of Depression-Related Symptoms and Adherence Problems in Automated Health Assessment Services.
    by John Piette, Jeremy Sussman, Paul Pfeiffer, Maria Silveira, Satinder Singh, and Mariel Lavieri.
    In Journal of Medical Internet Research, Vol 15, No. 7, 2013.

  43. Testing the Structure of SMS Messages for use in an Artificial Intelligence (AI)-driven SMS Antihypertensive Adherence Support Tool.
    by Karen Farris, Sean Newman, Satinder Singh, Larry An, and John Piette.
    Research Abstract in Wireless Health, 2013.
    pdf.

  44. Optimal Rewards in Multiagent Teams
    by Bingyao Liu, Satinder Singh, Richard L. Lewis, and Shiyin Qin
    In International Conference on Development and Learning-EpiRob, 2012.
    pdf.

  45. Lossy Stochastic Game Abstraction with Bounds
    by Tuomas Sandholm and Satinder Singh.
    In Proceedings of the 13th ACM Conference on Electronic Commerce (EC), 2012.
    pdf.
    A previous version appears in Fifth International Workshop on Optimization in Multi-Agent Systems (OPTMAS), 2012.

  46. Learning and Predicting Dynamic Networked Behavior with Graphical Multiagent Models
    by Quang Duong, Michael P. Wellman, Satinder Singh, and Michael Kearns.
    In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2012.
    pdf.

  47. Strong Mitigation: Nesting Search for Good Policies within Search for Good Reward
    by Jeshua Bratman, Satinder Singh, Richard Lewis, and Jonathan Sorg.
    In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2012.
    pdf.

  48. Security Games with Limited Surveillance
    by Bo An, David Kempe, Christopher Kiekintveld, Eric Shieh, Satinder Singh, Milind Tambe, and Yevgeniy Vorobeychik.
    In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI), 2012.
    pdf.

  49. Computing Stackelberg Equilibria in Discounted Stochastic Games
    by Yevgeniy Vorobeychik and Satinder Singh.
    In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI), 2012.
    pdf.
    (This is a corrected version of the paper that appeared in the conference proceedings.
    Major thanks to Vincent Conitzer for finding a counterexample to the main theorem in the now-corrected submitted version.)

  50. Planning Delayed-Response Queries and Transient Policies under Reward Uncertainty
    by Rob Cohn, Edmund Durfee and Satinder Singh.
    In Proceedings of the Seventh Annual Workshop on Multiagent Sequential Decision-Making Under Uncertainty (MSDM), held in conjunction with AAMAS, 2012.
    pdf.

  51. Planning and Evaluating Multiagent Influences Under Reward Uncertainty (Extended Abstract)
    by Stefan Witwicki, Inn-Tung Chen, Edmund Durfee and Satinder Singh.
    In 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2012.
    pdf.

  52. Learning to Make Predictions in Partially Observable Environments without a Generative Model
    by Erik Talvitie and Satinder Singh.
    In Journal of Artificial Intelligence Research, vol 42, pages 353-392, 2011.
    pdf.

  53. Optimal Rewards versus Leaf-Evaluation Heuristics in Planning Agents
    by Jonathan Sorg, Satinder Singh, and Richard Lewis.
    In Proceedings of the Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011.
    pdf.

  54. Comparing Action-Query Strategies in Semi-Autonomous Agents
    by Robert Cohn, Edmund Durfee, and Satinder Singh.
    In Proceedings of the Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011.
    pdf.
    An extended abstract also appears in the Proceedings of the 10th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2011.

  55. Learning and Predicting Dynamic Behavior with Graphical Multiagent Models
    by Quang Duong, Michael P. Wellman, Satinder Singh, and Michael Kearns.
    In 5th International Workshop on Social Networks Mining and Analysis at KDD (SNACKDD-11), 2011.
    pdf.
    An extended abstract also appears in the Proceedings of the 2nd Workshop on Information in Networks (WIN-10), 2010.

  56. Modeling Information Diffusion in Networks with Unobserved Links
    by Quang Duong, Michael P. Wellman, and Satinder Singh.
    In 3rd IEEE Conference on Social Computing (SocialCom-11), 2011.
    pdf.
    An earlier version also appears in the 5th International Workshop on Social Networks Mining and Analysis at KDD (SNACKDD-11), 2011

  57. Dynamic Incentive Mechanisms
    by David C. Parkes, Ruggiero Cavallo, Florin Constantin and Satinder Singh.
    In AI Magazine, Vol. 31, No. 4, pages 79-94, 2010.
    pdf.

  58. Reward Design via Online Gradient Ascent
    by Jonathan Sorg, Satinder Singh, and Richard Lewis.
    In Neural Information Processing Systems (NIPS), 2010.
    pdf.

  59. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective
    by Satinder Singh, Richard Lewis, Andrew Barto, and Jonathan Sorg.
    In IEEE Transactions on Autonomous Mental Development, Vol 2, No 2, 2010.
    pdf

  60. Modeling Multiple-mode Systems with Predictive State Representations
    by Britton Wolfe, Michael James and Satinder Singh.
    In Proceedings of the 13th International IEEE Conference on Intelligent Transportation Systems, 2010.
    pdf

  61. Variance-Based Rewards for Approximate Bayesian Reinforcement Learning
    by Jonathan Sorg, Satinder Singh, and Richard Lewis.
    In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), 2010.
    pdf

  62. Internal Rewards Mitigate Agent Boundedness
    by Jonathan Sorg, Satinder Singh, and Richard Lewis.
    In Proceedings of the 27th International Conference on Machine Learning (ICML), 2010.
    pdf

  63. A New Approach to Exploring Language Emergence as Boundedly Optimal Control in the Face of Environmental and Cognitive Constraints
    by Jeshua Bratman, Michael Shvartsman, Richard Lewis, and Satinder Singh.
    In Proceedings of the 10th International Conference on Cognitive Modeling (ICCM), 2010.
    (Honorable mention for Allen Newell Best Student Paper Award at ICCM)
    pdf

  64. Selecting Operator Queries Using Expected Myopic Gain
    by Robert Cohn, Michael Maxim, Edmund Durfee, and Satinder Singh.
    In Proceedings of the International Conference on Intelligent Agent Technology (IAT), 2010.
    pdf

  65. History-Dependent Graphical Multiagent Models
    by Quang Duong, Michael Wellman, Satinder Singh, and Yevgeniy Vorobeychik.
    In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2010.
    pdf

  66. Linear Options
    by Jonathan Sorg and Satinder Singh.
    In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2010.
    (Finalist for Pragnesh Jay Modi Best Student Paper Award)
    pdf

  67. Transfer via Soft Homomorphisms
    by Jonathan Sorg and Satinder Singh.
    In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2009.
    pdf

  68. SarsaLandmark: an Algorithm for Learning in POMDPs with Landmarks
    by Michael R. James and Satinder Singh.
    In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2009.
    pdf

  69. Learning Graphical Game Models
    by Quang Duong, Yevgeniy Vorobeychik, Satinder Singh and Michael Wellman.
    In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), 2009.
    pdf

  70. Where Do Rewards Come From?
    by Satinder Singh, Richard L. Lewis and Andrew G. Barto.
    In Proceedings of the Annual Conference of the Cognitive Science Society (CogSci), 2009.
    pdf

  71. Maintaining Predictions Over Time Without a Model
    by Erik Talvitie and Satinder Singh.
    In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), 2009.
    pdf

  72. Simple Local Models for Complex Dynamical Systems
    by Erik Talvitie and Satinder Singh.
    In Proceedings of the 22nd Annual Conference on Neural Information Processing Systems (NIPS), 2008.
    pdf

  73. Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State
    by David Wingate and Satinder Singh.
    In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 1176-1183, 2008.
    pdf

  74. Building Incomplete but Accurate Models
    by Erik Talvitie, Britton Wolfe and Satinder Singh.
    In Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2008.
    pdf

  75. Predictive Linear-Gaussian Models of Stochastic Dynamical Systems with Vector-Valued Actions and Observations
    by Matthew Rudary and Satinder Singh.
    In Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2008.
    pdf

  76. Knowledge Combination in Graphical Multiagent Models
    by Quang Duong, Michael Wellman and Satinder Singh.
    In Proceedings of the 24th Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2008.
    pdf

  77. Approximate Predictive State Representations
    by Britton Wolfe, Michael R. James and Satinder Singh.
    In Proceedings of the 2008 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2008.
    (Finalist for Pragnesh Jay Modi Best Student Paper Award)
    pdf

  78. Learning Payoff Functions in Infinite Games
    by Yevgeniy Vorobeychik, Michael Wellman and Satinder Singh.
    Machine Learning Journal 67:145-168, 2007.
    pdf

  79. Constraint Satisfaction Algorithms for Graphical Games
    by Vishal Soni, Satinder Singh and Michael Wellman.
    In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2007.
    pdf

  80. On Discovery and Learning of Models with Predictive Representations of State for Agents with Continuous Actions and Observations
    by David Wingate and Satinder Singh.
    In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2007.
    pdf

  81. Relational Knowledge with Predictive State Representations
    by David Wingate, Vishal Soni, Britton Wolfe and Satinder Singh.
    In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 2035-2040, 2007.
    pdf

  82. An Experts Algorithm for Transfer Learning
    by Erik Talvitie and Satinder Singh.
    In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 1065-1070, 2007.
    pdf

  83. Abstraction in Predictive State Representations
    by Vishal Soni and Satinder Singh.
    In Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI), 2007.
    pdf

  84. Exponential Family Predictive Representations of State
    by David Wingate and Satinder Singh.
    In Advances in Neural Information Processing Systems 20 (NIPS), pages 1617-1624, 2007.
    pdf

  85. Cobot in LambdaMOO: An Adaptive Social Statistics Agent
    by Charles Isbell, Michael Kearns, Satinder Singh, Christian Shelton, Peter Stone and Dave Kormann.
    In Journal of Autonomous Agents and Multi-Agent Systems, 13(3), pages 327-354, 2006.
    pdf

  86. Mixtures of Predictive Linear Gaussian Models for Nonlinear Stochastic Dynamical Systems
    by David Wingate and Satinder Singh.
    In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.
    pdf

  87. Using Homomorphisms to Transfer Options Across Reinforcement Learning Domains
    by Vishal Soni and Satinder Singh.
    In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.
    pdf

  88. Kernel Predictive Linear-Gaussian Models for Nonlinear Stochastic Dynamical Systems
    by David Wingate and Satinder Singh.
    In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 1017-1024, 2006.
    pdf

  89. Predictive linear-Gaussian models of controlled stochastic dynamical systems
    by Matthew Rudary and Satinder Singh.
    In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 777-784, 2006.
    pdf

  90. Predictive State Representations with Options
    by Britton Wolfe and Satinder Singh.
    In Proceedings of the 23rd International Conference on Machine Learning (ICML), pages 1025-1032, 2006.
    pdf

  91. Empirical Game-Theoretic Analysis of Chaturanga
    by Christopher Kiekintveld, Michael Wellman and Satinder Singh.
    In Proceedings of AAMAS-06 Workshop on Game-Theoretic and Decision-Theoretic Agents, 2006.
    pdf

  92. Optimal Coordinated Planning Amongst Self-Interested Agents with Private State
    by Ruggiero Cavallo, David C. Parkes and Satinder Singh.
    In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI), 2006.
    pdf

  93. Optimal Coordination of Loosely-Coupled Self-Interested Robots
    by Ruggiero Cavallo, David C. Parkes, and Satinder Singh.
    In Workshop on Auction Mechanisms for Robot Coordination at AAAI'06, 2006.
    pdf

  94. Reinforcement Learning of Hierarchical Skills on the Sony Aibo Robot
    by Vishal Soni and Satinder Singh.
    In Proceedings of the 5th International Conference on Development and Learning (ICDL), 2006.
    pdf

  95. Off-policy Learning with Options and Recognizers
    by Doina Precup, Richard Sutton, Cosmin Paduraru, Anna Koop and Satinder Singh.
    In Proceedings of Advances in Neural Information Processing Systems 18 (NIPS), pages 1097-1104, 2006.
    pdf

  96. Intrinsically Motivated Reinforcement Learning
    by Satinder Singh, Andrew G. Barto and Nuttapong Chentanez.
    In Proceedings of Advances in Neural Information Processing Systems 17 (NIPS), pages 1281-1288, 2005.
    pdf

  97. Approximately Efficient Online Mechanism Design
    by David Parkes, Satinder Singh and Dimah Yanovsky.
    In Proceedings of Advances in Neural Information Processing Systems 17 (NIPS), pages 1049-1056, 2005.
    pdf

  98. Predictive linear-Gaussian models of stochastic dynamical systems
    by Matthew Rudary, Satinder Singh and David Wingate.
    In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pages 501-508, 2005.
    pdf

  99. Combining Memory and Landmarks with Predictive State Representations
    by Michael R. James, Britton Wolfe and Satinder Singh.
    In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005.
    pdf

  100. Learning Payoff Functions in Infinite Games
    by Yevgeniy Vorobeychik, Michael Wellman and Satinder Singh.
    In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005
    pdf
    (An expanded version was later published in the Machine Learning Journal; pdf)

  101. Planning in Models that Combine Memory with Predictive Representations of State
    by Michael R. James and Satinder Singh.
    In Proceedings of the 20th National Conference on Artificial Intelligence (AAAI), pages 987-992, 2005.
    pdf

  102. Learning Predictive State Representations in Dynamical Systems Without Reset
    by Britton Wolfe, Michael R. James and Satinder Singh.
    In Proceedings of the 22nd International Conference on Machine Learning (ICML), 2005.
    pdf

  103. Intrinsically Motivated Learning of Hierarchical Collections of Skills
    by Andrew G. Barto, Satinder Singh, and Nuttapong Chentanez.
    In Proceedings of International Conference on Developmental Learning (ICDL), 2004.
    pdf

  104. Predictive State Representations: A New Theory for Modeling Dynamical Systems
    by Satinder Singh, Michael R. James and Matthew R. Rudary.
    In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512-519, 2004.
    pdf

  105. Learning and Discovery of Predictive State Representations in Dynamical Systems with Reset
    by Michael James and Satinder Singh.
    In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 417-424, 2004.
    pdf

  106. Adaptive Cognitive Orthotics: Combining Reinforcement Learning and Constraint-Based Temporal Reasoning
    by Matthew Rudary, Satinder Singh and Martha Pollack.
    In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), pages 719-726, 2004.
    pdf

  107. Planning with Predictive State Representations
    by Michael R. James, Satinder Singh and Michael Littman.
    In Proceedings of the International Conference on Machine Learning and Applications (ICMLA), pages 304-311, 2004.
    pdf

  108. Computing Approximate Bayes Nash Equilibria in Tree-Games of Incomplete Information
    by Satinder Singh, Vishal Soni and Michael Wellman.
    In Proceedings of the Fifth ACM Conference on Electronic Commerce (EC), pages 81-90, 2004.
    pdf

  109. Distributed Feedback Control for Decision Making on Supply Chains
    by Christopher Kiekintveld, Michael P. Wellman, Satinder Singh, Joshua Estelle, Yevgeniy Vorobeychik, Vishal Soni and Matthew Rudary.
    In Proceedings of the 14th International Conference on Automated Planning and Scheduling (ICAPS), pages 384-392, 2004.
    pdf.

  110. Strategic Interactions in the TAC 2003 Supply Chain Tournament
    by Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh, Christopher Kiekintveld and Vishal Soni.
    In Proceedings of the Fourth International Conference on Computer & Games, 2004.
    pdf

  111. A Nonlinear Predictive State Representation
    by Matthew Rudary and Satinder Singh.
    In Advances in Neural Information Processing Systems 16 (NIPS), pages 855-862, 2004.
    pdf

  112. Strategic Procurement in TAC/SCM: An Empirical Game-Theoretic Analysis
    by Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder Singh, Christopher Kiekintveld, and Vishal Soni.
    In Workshop on Trading Agent Design and Analysis (TADA), 2004.
    pdf

  113. An MDP-Based Approach to Online Mechanism Design
    by David Parkes and Satinder Singh.
    In Advances in Neural Information Processing Systems 16 (NIPS), pages 791-798, 2004.
    pdf

  114. Learning Predictive State Representations
    by Satinder Singh, Michael Littman, Nicholas Jong, David Pardoe and Peter Stone.
    In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 712-719, 2003.
    pdf

  115. Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System
    by Satinder Singh, Diane Litman, Michael Kearns and Marilyn Walker.
    In Journal of Artificial Intelligence Research (JAIR), Volume 16, pages 105-133, 2002.
    pdf

  116. CobotDS: A Spoken Dialogue System for Chat
    by Michael Kearns, Charles Isbell, Satinder Singh, Diane Litman, and J. Howe.
    In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI), pages 425-430, 2002.
    pdf

  117. Near-Optimal Reinforcement Learning in Polynomial Time
    by Michael Kearns and Satinder Singh.
    In Machine Learning journal, Volume 49, Issue 2, pages 209-232, 2002.
    (A shorter version appears in ICML 1998.)
    pdf

  118. Predictive Representations of State
    by Michael Littman, Richard Sutton and Satinder Singh.
    In Advances in Neural Information Processing Systems 14 (NIPS), pages 1555-1561, 2002.
    pdf

  119. ATTac-2000: An Adaptive Autonomous Bidding Agent
    by Peter Stone, Michael Littman, Satinder Singh and Michael Kearns.
    In Journal of Artificial Intelligence Research (JAIR), Vol 15, pages 189-206, 2001.
    pdf.
    (A shorter version also appears in AAAI'01 as listed below)

  120. Graphical Models for Game Theory
    by Michael Kearns, Michael Littman and Satinder Singh.
    In Proceedings of the Seventeenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 253-260, 2001.
    pdf

  121. An Efficient Exact Algorithm for Singly Connected Graphical Games
    by Michael Littman, Michael Kearns and Satinder Singh.
    In Advances in Neural Information Processing Systems 14 (NIPS), pages 817-823, 2002.
    pdf

  122. FAucs: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents
    by Janos Csirik, Michael Littman, Satinder Singh and Peter Stone.
    In Electronic Commerce: Proceedings of the Second International Workshop, 2001.
    pdf

  123. ATTac-2000: An Adaptive Autonomous Bidding Agent
    by Peter Stone, Michael Littman, Satinder Singh and Michael Kearns.
    In Proceedings of the Fifth International Conference on Autonomous Agents (AGENTS), pages 238-245, 2001.
    pdf

  124. Cobot: A Social Reinforcement Learning Agent
    by Charles Isbell, Christian Shelton, Michael Kearns, Satinder Singh and Peter Stone.
    In Advances in Neural Information Processing Systems 14 (NIPS) pages 1393-1400, 2002.
    pdf

  125. A Social Reinforcement Learning Agent
    by Charles Isbell, Christian Shelton, Michael Kearns, Satinder Singh and Peter Stone.
    In Proceedings of the Fifth International Conference on Autonomous Agents (AGENTS), pages 377-384, 2001.
    Winner of Best Paper Award.
    pdf

  126. Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System
    by Satinder Singh, Michael Kearns, Diane Litman, and Marilyn Walker.
    In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 645-651, 2000.
    pdf

  127. Cobot in LambdaMOO: A Social Statistics Agent
    by Charles Isbell, Michael Kearns, Dave Kormann, Satinder Singh and Peter Stone.
    In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 36-41, 2000.
    pdf

  128. Automatic Optimization of Dialogue Management
    by Diane Litman, Michael Kearns, Satinder Singh and Marilyn Walker.
    In Proceedings of the 18th International Conference on Computational Linguistics (COLING), pages 502-508, 2000.
    pdf

  129. A Boosting Approach to Topic Spotting on Subdialogues
    by Kary Myers, Michael Kearns, Satinder Singh and Marilyn Walker.
    In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 655-662, 2000.
    pdf

  130. Eligibility Traces for Off-Policy Policy Evaluation
    by Doina Precup, Richard Sutton, and Satinder Singh.
    In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 759-766, 2000.
    pdf

  131. Nash Convergence of Gradient Dynamics in General-Sum Games
    by Satinder Singh, Michael Kearns and Yishay Mansour.
    In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 541-548, 2000.
    pdf

  132. Fast Planning in Stochastic Games
    by Michael Kearns, Yishay Mansour, and Satinder Singh.
    In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 309-316, 2000.
    pdf

  133. "Bias-Variance" Error Bounds for Temporal Difference Updates
    by Michael Kearns and Satinder Singh.
    In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT), pages 142-147, 2000.
    pdf

  134. Reinforcement Learning for Spoken Dialogue Systems
    by Satinder Singh, Michael Kearns, Diane Litman and Marilyn Walker.
    In Advances in Neural Information Processing Systems 12 (NIPS), 2000.
    pdf

  135. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
    by Satinder Singh, Tommi Jaakkola, Michael Littman, and Csaba Szepesvari.
    In Machine Learning Journal, vol 38(3), pages 287-308, 2000.
    pdf

  136. Policy Gradient Methods for Reinforcement Learning with Function Approximation
    by Richard Sutton, Dave McAllester, Satinder Singh and Yishay Mansour.
    In Advances in Neural Information Processing Systems 12 (NIPS), 2000.
    pdf

  137. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
    by Richard Sutton, Doina Precup and Satinder Singh.
    In Artificial Intelligence Journal, Volume 112, pages 181-211, 1999.
    pdf

  138. Approximate Planning for Factored POMDPs using Belief State Simplification
    by Dave McAllester and Satinder Singh.
    In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 409-416, 1999.
    pdf

  139. On the Complexity of Policy Iteration
    by Yishay Mansour and Satinder Singh.
    In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 401-408, 1999.
    pdf

  140. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
    by Michael Kearns and Satinder Singh.
    In Advances in Neural Information Processing Systems 11 (NIPS), pages 996-1002, 1999.
    pdf

  141. Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes
    by John K. Williams and Satinder Singh.
    In Advances in Neural Information Processing Systems 11 (NIPS), pages 1073-1079, 1999.
    pdf

  142. Optimizing Admission Control while Ensuring Quality of Service in Multimedia Networks via Reinforcement Learning
    by Timothy Brown, Hong Tong, and Satinder Singh.
    In Advances in Neural Information Processing Systems 11 (NIPS), pages 982-988, 1999.
    pdf

  143. Improved Switching among Temporally Abstract Actions
    by Richard Sutton, Satinder Singh, Doina Precup and Balaraman Ravindran.
    In Advances in Neural Information Processing Systems 11 (NIPS), pages 1066-1072, 1999.
    pdf

  144. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes
    by John Loch and Satinder Singh.
    In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 323-331, 1998.
    pdf

  145. Near-Optimal Reinforcement Learning in Polynomial Time
    by Michael Kearns and Satinder Singh.
    In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 260-268, 1998.
    pdf

  146. Intra-Option Learning about Temporally Abstract Actions
    by Richard Sutton, Doina Precup and Satinder Singh.
    In Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pages 556-564, 1998.
    pdf

  147. Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors
    by Doina Precup, Richard Sutton, and Satinder Singh.
    In Proceedings of the 10th European Conference on Machine Learning (ECML), pages 382-393, 1998.
    pdf

  148. Hierarchical Optimal Control of MDPs
    by Amy McGovern, Doina Precup, Balaraman Ravindran, Satinder Singh and Richard Sutton.
    In Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, 1998.
    pdf

  149. How to Dynamically Merge Markov Decision Processes
    by Satinder Singh and David Cohn.
    In Advances in Neural Information Processing Systems 10 (NIPS), pages 1057-1063, 1998.
    pdf

  150. Analytical Mean Squared Error Curves for Temporal Difference Learning
    by Satinder Singh and Peter Dayan.
    In Machine Learning Journal, Volume 32, Issue 1, pages 5-40, 1998.
    pdf.
    A shorter version appears in the NIPS 9 Proceedings

  151. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems
    by Satinder Singh and Dimitri Bertsekas.
    In Advances in Neural Information Processing Systems 9 (NIPS), pages 974-980, 1997.
    pdf

  152. Planning with Closed-Loop Macro Actions
    by Doina Precup, Richard Sutton and Satinder Singh.
    In Proceedings of AAAI Fall Symposium on Model-directed Autonomous Systems, 1997.
    pdf

  153. Predicting Lifetimes in Dynamically Allocated Memory
    by David Cohn and Satinder Singh.
    In Advances in Neural Information Processing Systems 9 (NIPS), pages 939-945, 1997.
    pdf

  154. Analytical Mean Squared Error Curves for Temporal Difference Learning
    by Satinder Singh and Peter Dayan.
    In Advances in Neural Information Processing Systems 9 (NIPS), pages 1054-1060, 1997.
    pdf

  155. Reinforcement Learning with Replacing Eligibility Traces
    by Satinder Singh and Richard Sutton.
    In Machine Learning Journal, Volume 22, Issue 1, pages 123-158, 1996.
    pdf abstract

  156. Learning Curve Bounds for Markov Decision Processes with Undiscounted Rewards
    by Lawrence Saul and Satinder Singh.
    In Proceedings of 9th Annual Conference on Computational Learning Theory (COLT), pages 147-156, 1996.
    pdf

  157. Long Term Potentiation, Navigation and Dynamic Programming
    by Peter Dayan and Satinder Singh.
    In Proceedings of Computation and Neural Systems Meeting (CNS), 1996.
    pdf

  158. Improving Policies Without Measuring Merits
    by Peter Dayan and Satinder Singh.
    In Advances in Neural Information Processing Systems 8 (NIPS), pages 1059-1065, 1996.
    pdf

  159. Markov Decision Processes in Large State Spaces
    by Lawrence Saul and Satinder Singh.
    In Proceedings of 8th Annual Workshop on Computational Learning Theory (COLT), pages 281-288, 1995.
    pdf

  160. Learning to Act using Real-Time Dynamic Programming
    by Andrew Barto, Steve Bradtke and Satinder Singh.
    In Artificial Intelligence, Volume 72, pages 81-138, 1995.
    pdf

  161. On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
    by Tommi Jaakkola, Michael Jordan and Satinder Singh.
    In Neural Computation, Volume 6, Number 6, pages 1185-1201, 1994.
    pdf

  162. Reinforcement Learning With Soft State Aggregation
    by Satinder Singh, Tommi Jaakkola and Michael Jordan.
    In Advances in Neural Information Processing Systems 7 (NIPS), pages 361-368, 1995.
    pdf

  163. Stochastic Convergence of Iterative DP Algorithms
    by Tommi Jaakkola, Michael Jordan and Satinder Singh.
    In Advances in Neural Information Processing Systems 6 (NIPS), pages 703-710, 1994.
    pdf

  164. Reinforcement Learning Algorithm for Partially Observable Markov Problems
    by Tommi Jaakkola, Satinder Singh and Michael Jordan.
    In Advances in Neural Information Processing Systems 7 (NIPS), pages 345-352, 1995.
    pdf

  165. Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes
    by Satinder Singh.
    In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI), pages 700-705, 1994.
    pdf

  166. Learning Without State-Estimation in Partially Observable Markovian Decision Processes
    by Satinder Singh, Tommi Jaakkola and Michael Jordan.
    In Machine Learning: Proceedings of the Eleventh International Conference (ICML), pages 284-292, 1994.
    pdf

  167. On Step-Size and Bias in Temporal-Difference Learning
    by Richard Sutton and Satinder Singh.
    In Proceedings of Eighth Yale Workshop on Adaptive and Learning Systems, 1994.
    pdf abstract

  168. Robust Reinforcement Learning in Motion Planning
    by Satinder Singh, Andrew Barto, Roderic Grupen, and Christopher Connolly.
    In Advances in Neural Information Processing Systems 6 (NIPS), pages 655-662, 1994.
    pdf

  169. An Upper Bound on the Loss from Approximate Optimal-Value Functions
    by Satinder Singh and Richard Yee.
    In Machine Learning, Volume 16, Issue 3, pages 227-233, 1994.
    pdf

  170. Distributed Representation of Limb Motor Programs in Arrays of Adjustable Pattern Generators
    by Neil Berthier, Satinder Singh, Andrew Barto, and Jim Houk.
    In Journal of Cognitive Neuroscience, vol 5:1, pages 56-78, 1993.
    pdf

  171. Reinforcement Learning with a Hierarchy of Abstract Models
    by Satinder Singh.
    In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI), pages 202-207, 1992.
    pdf

  172. A Cortico-Cerebellar Model that Learns to Generate Distributed Motor Commands to Control a Kinematic Arm
    by Satinder Singh, Neil Berthier, Andrew Barto, and Jim Houk.
    In Advances in Neural Information Processing Systems 4 (NIPS), pages 611-618, 1992.
    pdf

  173. Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models
    by Satinder Singh.
    In Proceedings of the Ninth Machine Learning Conference, pages 406-415, 1992.
    pdf

  174. Transfer of Learning by Composing Solutions of Elemental Sequential Tasks
    by Satinder Singh.
    In Machine Learning Journal, Volume 8, Issue 3, pages 323-339, 1992.
    pdf

  175. The Efficient Learning of Multiple Task Sequences
    by Satinder Singh.
    In Advances in Neural Information Processing Systems 4 (NIPS), pages 251-258, 1992.
    pdf

  176. Transfer of Learning Across Compositions of Sequential Tasks
    by Satinder Singh.
    In Machine Learning: Proceedings of the Eighth International Workshop, pages 348-352, 1991.
    pdf

  177. Reinforcement Learning and Dynamic Programming
    by Andrew Barto and Satinder Singh.
    In Proceedings of Sixth Yale Workshop on Adaptive and Learning Systems, 1990.

Magazine Articles, Book Chapters and Others

  • Value-Driven Procurement in the TAC Supply Chain Game
    by Christopher Kiekintveld, Michael P. Wellman, Satinder Singh, and Vishal Soni.
    In SIGecom Exchanges, Volume 4.3, pages 9-19, 2004.
    pdf

  • Reinforcement Learning for 3 vs. 2 Keepaway
    by Peter Stone, Richard Sutton, and Satinder Singh.
    In RoboCup-2000: Robot Soccer World Cup IV, P. Stone, T. Balch, and G. Kraetzschmar, Eds., Springer Verlag.
    pdf.
    An earlier version appeared in the Proceedings of the RoboCup-2000 Workshop, Melbourne, Australia

  • Soft Dynamic Programming Algorithms: Convergence Proofs
    by Satinder Singh.
    In Proceedings of Workshop on Computational Learning and Natural Learning (CLNL), Provincetown, Massachusetts, 1993.
    pdf

  • On the Computational Economics of Reinforcement Learning
    by Andrew Barto and Satinder Singh.
    In Proceedings of Connectionist Summer School, 1990.
    pdf

  • An Adaptive Sensorimotor Network Inspired by the Physiology of the Cerebellum
    by Jim Houk, Satinder Singh, Charles Fisher, and Andrew Barto.
    Appears as a chapter in WT Miller, RS Sutton, and PJ Werbos, editors, Neural Networks for Control, pages 301-348, 1989.

My one paper in a non-technical journal!

  • How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning
    by Satinder Singh, Peter Norvig and David Cohn.
    In Dr. Dobb's Journal, March issue, 1997.

    pdf
    [html version]

An Almost Tutorial on RL (extracted from my Thesis)

  • An (Almost) Tutorial on Reinforcement Learning
    gzipped postscript. Extracted from my 1993 thesis.

Going Nowhere Papers

  • Asynchronous Modified Policy Iteration with Single-sided Updates
    by Satinder Singh and Vijay Gullapalli.
    Working Paper, 1993.
    pdf