2018+ Papers

  • Pairwise Weights for Temporal Credit Assignment
    by Zeyu Zheng, Risto Vuorio, Richard Lewis, and Satinder Singh.
    In 36th AAAI Conference on Artificial Intelligence, 2022
    arXiv version.

  • Reward is Enough
    by David Silver, Satinder Singh, Doina Precup, and Richard Sutton.
    In Artificial Intelligence, vol 299, 2021
    pdf.

  • On the Expressivity of Markov Reward
    by David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2021
    Outstanding Paper Award
    pdf.

  • Proper Value Equivalence
    by Christopher Grimm, Andre Barreto, Gregory Farquhar, David Silver, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2021
    arXiv version.

  • Discovery of Options via Meta-Learned Subgoals
    by Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2021
    arXiv version.

  • Learning State Representations from Random Deep Action-Conditional Predictions
    by Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2021
    arXiv version.

  • Reward is Enough for Convex MDPs
    by Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2021
    arXiv version.

  • Reinforcement Learning of Implicit and Explicit Control Flow Instructions
    by Ethan Brooks, Janarthanan Rajendran, Richard Lewis, and Satinder Singh.
    In International Conference on Machine Learning (ICML), 2021
    arXiv version.

  • Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-Person Simulated 3D Environment
    by Wilka Carvalho, Anthony Liang, Kimin Lee, Honglak Lee, Richard Lewis, and Satinder Singh.
    In International Joint Conference on Artificial Intelligence (IJCAI), 2021
    arXiv version.

  • Rational use of episodic and working memory: A normative account of prospective memory
    by Ida Momennejad, Jarrod Lewis-Peacock, Kenneth A. Norman, Jonathan D. Cohen, Satinder Singh, and Richard L. Lewis.
    In Neuropsychologia, vol 158, 2021
    pdf.

  • Efficient Querying for Cooperative Probabilistic Commitments
    by Qi Zhang, Edmund Durfee, and Satinder Singh.
    In 35th AAAI Conference on Artificial Intelligence (AAAI), 2021
    arXiv version.

  • The Value Equivalence Principle for Model-Based Reinforcement Learning
    by Christopher Grimm, Andre Barreto, Satinder Singh, and David Silver.
    In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
    arXiv version.

  • Discovering Reinforcement Learning Algorithms
    by Junhyuk Oh, Matteo Hessel, Wojciech Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, and David Silver.
    In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
    arXiv version.

  • Meta-Gradient Reinforcement Learning with an Objective Discovered Online
    by Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, and David Silver.
    In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
    arXiv version.

  • Learning to Play No-Press Diplomacy with Best Response Policy Iteration
    by Thomas Anthony, Tom Eccles, Andrea Tacchetti, Janos Kramar, Ian Gemp, Thomas Hudson, Nicolas Porcel, Marc Lanctot, Julien Perolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, and Yoram Bachrach.
    In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
    arXiv version.

  • A Self-Tuning Actor-Critic Algorithm
    by Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, and Satinder Singh.
    In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
    arXiv version.

  • On Efficiency in Hierarchical Reinforcement Learning
    by Zheng Wen, Doina Precup, Morteza Ibrahimi, Andre Barreto, Benjamin Van Roy, and Satinder Singh.
    In Thirty Fourth Conference on Neural Information Processing Systems (NeurIPS), 2020
    pdf.

  • What can Learned Intrinsic Rewards Capture?
    by Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, and Satinder Singh.
    In International Conference on Machine Learning (ICML), 2020.
    arXiv version.

  • Sample Complexity of Reinforcement Learning Using Linearly Combined Model Ensembles
    by Aditya Modi, Nan Jiang, Ambuj Tewari, and Satinder Singh.
    In International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
    arXiv version.

  • How Should An Agent Practice?
    by Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, and Satinder Singh.
    In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020.
    pdf.

  • Modeling Probabilistic Commitments for Maintenance is Inherently Harder than for Achievement
    by Qi Zhang, Edmund Durfee, and Satinder Singh.
    In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020.
    pdf.

  • Querying to Find a Safe Policy under Uncertain Safety Constraints in Markov Decision Processes
    by Shun Zhang, Edmund Durfee, and Satinder Singh.
    In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020.
    pdf.

  • Discovery of Useful Questions as Auxiliary Tasks
    by Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2019.
    arXiv version.

  • Behavior Suite for Reinforcement Learning
    by Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, and Hado van Hasselt.
    In Neural Information Processing Systems (NeurIPS), 2019.
    arXiv version.

  • Hindsight Credit Assignment
    by Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, and Remi Munos.
    In Neural Information Processing Systems (NeurIPS), 2019.
    pdf.

  • No Press Diplomacy: Modeling Multi-Agent Gameplay
    by Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, and Aaron Courville.
    In Neural Information Processing Systems (NeurIPS), 2019.
    arXiv version.

  • Disentangled Cumulants Help Successor Representations Transfer to New Tasks
    by Christopher Grimm, Irina Higgins, Andre Barreto, Denis Teplyashin, Markus Wulfmeier, Tim Hertweck, Raia Hadsell, and Satinder Singh.
    arXiv version.

  • Deep Reinforcement Learning for Dynamic Multi-Driver Dispatching and Repositioning Problem
    by John Holler, Risto Vuorio, Tiancheng Jin, Satinder Singh, Zhiwei Qin, Jieping Ye, Xiaocheng Tang, Yan Jiao, and Chenxi Wang.
    In International Conference on Data Mining (ICDM-Short Paper), 2019.
    pdf.

  • Learning Independently-Obtainable Reward Functions
    by Christopher Grimm and Satinder Singh.
    arXiv version.

  • Many-Goals Reinforcement Learning
    by Vivek Veeriah, Junhyuk Oh, and Satinder Singh.
    arXiv version.

  • Learning to Communicate and Solve Visual Blocks-World Tasks
    by Qi Zhang, Richard Lewis, Satinder Singh, and Edmund Durfee.
    In Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), 2019.
    pdf.

  • On Learning Intrinsic Rewards for Policy Gradient Methods
    by Zeyu Zheng, Junhyuk Oh, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2018.
    arXiv version.

  • Generative Adversarial Self-Imitation Learning
    by Yijie Guo, Junhyuk Oh, Satinder Singh, and Honglak Lee.
    In Neural Information Processing Systems (NeurIPS), 2018.
    arXiv version.

  • Completing State Representations Using Spectral Learning
    by Nan Jiang, Alex Kulesza, and Satinder Singh.
    In Neural Information Processing Systems (NeurIPS), 2018.
    pdf.

  • Learning End-to-End Goal-Oriented Dialog with Multiple Answers
    by Janarthanan Rajendran, Jatin Ganhotra, Satinder Singh, and Lazaros Polymenakos.
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
    pdf.

  • Self-Imitation Learning
    by Junhyuk Oh, Yijie Guo, Satinder Singh, and Honglak Lee.
    In International Conference on Machine Learning (ICML), 2018.
    arXiv version.

  • Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes
    by Shun Zhang, Edmund Durfee, and Satinder Singh.
    In International Joint Conference on Artificial Intelligence (IJCAI), 2018.
    pdf.

  • Markov Decision Processes with Continuous Side Information
    by Aditya Modi, Nan Jiang, Satinder Singh, and Ambuj Tewari.
    In International Conference on Algorithmic Learning Theory (ALT), 2018.
    conf pdf, arXiv link.

  • The Advantage of Doubling: A Deep Reinforcement Learning Approach to Studying the Double Team in the NBA
    by Jiaxuan Wang, Ian Fox, Jonathan Skaza, Nick Linck, Satinder Singh, and Jenna Wiens.
    In Sloan Sports Analytics Conference, 2018.
    arXiv link.
