RI EAGER:
Memory-based learning of effective actions

Benjamin Kuipers, PI
University of Michigan

EAGER grant (IIS-1252987) from the NSF Robust Intelligence program, 2012-2014.

Project Summary

A developing agent such as a human baby (or, in our proposed work, a learning robot) acts apparently randomly at first, but rapidly learns increasingly purposeful and effective actions. We propose to develop computational models of how this learning process could take place, implementing and testing these models on a learning robot. Our learning agent is an embodied agent, embedded in an environment (initially simulated, progressing to physical implementation on existing robots in our laboratory). The agent starts with uninterpreted sense and motor vectors. Prior stages of learning (Pierce & Kuipers, 1997; Modayil & Kuipers, 2008) provide the agent with the geometry of its sensors and the ability to individuate and track moving objects within its sensory field of view. Our goal is to find a general way for such a learning agent to learn effective actions, ranging from using a hand to manipulate objects on a table top, to moving through a complex environment without colliding with walls or pedestrians, to learning to balance and walk.

We draw on insights from two superficially different projects that have complementary strengths. The QLAP system (Mugan, 2010; Mugan & Kuipers, 2012) exploits a qualitative abstraction of continuous sensor input, learning causal contingencies and DBN and MDP models of the causal world, and building a hierarchy of action models. The MPEPC system (Park & Kuipers, 2011, 2012) factors the continuous navigation problem for a mobile robot into a local unconstrained control law and a global optimization process that balances constraints such as progress and collision avoidance. Both methods have a local phase (learning contingencies and local control laws) and a global phase (learning a hierarchy of actions, and finding extended routes that balance constraints). By reformulating these two methods within a common memory-based framework, we believe we can combine their strengths and overcome their limitations.
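To make the qualitative abstraction step concrete, the following is a minimal sketch (not the actual QLAP implementation) of how a continuous sensor value can be mapped to a discrete qualitative value relative to a set of learned landmark values; the function name, landmark encoding, and tolerance are illustrative assumptions.

```python
# Hypothetical sketch of a QLAP-style qualitative abstraction:
# a continuous value is mapped either to a landmark it is (nearly)
# equal to, or to the open interval between adjacent landmarks.
import bisect

def qualitative_value(x, landmarks, tol=1e-6):
    """Map continuous x to a qualitative value.

    landmarks: sorted list of landmark values.
    Returns ('at', i) if x is within tol of landmarks[i];
    otherwise ('between', i), meaning x lies in the open interval
    below landmarks[i] (i == 0 means below all landmarks,
    i == len(landmarks) means above all of them).
    """
    for i, lm in enumerate(landmarks):
        if abs(x - lm) <= tol:
            return ('at', i)
    i = bisect.bisect_left(landmarks, x)
    return ('between', i)
```

Causal contingencies are then learned over these discrete values rather than over the raw continuous stream, which is what makes DBN and MDP models of the world tractable.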

We plan to use memory-based (case-based) learning methods that highlight common properties of these two methods, and that allow features of the presenting case to retrieve related cases from the knowledge base. The lowest-level representation is a simple feature vector --- in the case of local motion control, it specifies the target pose in the egocentric frame of reference, along with the parameters of the motion control law that attempts to reach it and the quality of the resulting trajectory. Retrieval is done by nearest-neighbor search, combining information from the retrieved cases by Locally Weighted Regression (Atkeson et al., 1997) or its successor, Locally Weighted Projection Regression (Vijayakumar et al., 2005). At the higher level of action learning, a case is described by identifying the critical environmental constraints that determine the global structure of the action.
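The retrieve-and-combine step can be sketched as follows, in the spirit of Atkeson et al. (1997): the k nearest stored cases to the query feature vector are retrieved and combined by a Gaussian-weighted local linear fit. The feature encoding, kernel choice, and bandwidth here are illustrative assumptions, not the project's actual design.

```python
# Hedged sketch of nearest-neighbor retrieval plus Locally Weighted
# Regression over an episodic case memory.
import numpy as np

def lwr_predict(query, cases_x, cases_y, k=5, bandwidth=1.0):
    """Predict an outcome for `query` from stored cases.

    cases_x: (N, d) array of case feature vectors (e.g. target pose
             and control-law parameters);
    cases_y: (N,) array of outcomes (e.g. trajectory quality).
    The k nearest cases are retrieved and combined by a
    Gaussian-weighted local linear regression.
    """
    # Nearest-neighbor retrieval by Euclidean distance.
    dists = np.linalg.norm(cases_x - query, axis=1)
    idx = np.argsort(dists)[:k]
    X, y = cases_x[idx], cases_y[idx]
    # Gaussian kernel weights on the retrieved cases.
    w = np.exp(-(dists[idx] / bandwidth) ** 2)
    # Weighted linear least squares with an intercept term.
    A = np.hstack([X, np.ones((len(idx), 1))])
    W = np.diag(w)
    beta, *_ = np.linalg.lstsq(W @ A, W @ y, rcond=None)
    return np.append(query, 1.0) @ beta
```

Locally Weighted Projection Regression replaces the dense local fit with incremental projections, which scales this idea to high-dimensional feature vectors.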

Our experiments will focus on learning a hierarchy of progressively more sophisticated actions in the two domains covered by the previous projects: manipulation of blocks on a table top, and navigation planning in an incompletely known dynamic environment.

Intellectual Merit: This project combines insights from two different approaches for learning and representing effective action in complex and partially known domains. We will use sophisticated memory-based methods for identifying and combining relevant knowledge stored in episodic memory about previous action attempts. The resulting action models become part of the foundation for commonsense knowledge in an intelligent agent.

Broader Impact: This project will train a graduate student in computer science, robotics, computer vision, machine learning, human modeling, and control, helping to meet critical national needs. A better understanding of developmental learning of skilled and robust action has implications for the understanding of learning disabilities. This could lead, in the long run, to advances in both diagnosis and remediation of learning disabilities. We have also found that robotics is a very effective topic for outreach to the general public, including encouraging students to pursue further education in STEM fields.

Publications

Related Recent Publications

These publications are related, but predate this project or were supported by other funding.

The full set of papers on our intelligent wheelchair research is available.


This work has taken place in the Intelligent Robotics Lab in the Computer Science and Engineering Division of the Electrical Engineering and Computer Science Department at the University of Michigan. Research of the Intelligent Robotics lab is supported in part by grant IIS-1252987 from the National Science Foundation.
BJK