Memory-based learning of effective actions

Benjamin Kuipers, PI
University of Michigan

EAGER grant (IIS-1252987) from NSF Robust Intelligence program, 2012-2014.

Project Summary

A developing agent such as a human baby (or, in our proposed work, a learning robot) acts apparently randomly at first, but rapidly learns increasingly purposeful and effective actions. We propose to develop computational models of how this learning process could take place, implementing and testing these computational models on a learning robot. Our learning agent is an embodied agent, embedded in an environment (initially simulated, progressing to physical implementation on existing robots in our laboratory). The agent starts with an uninterpreted sense vector and motor vector. Prior stages of learning (Pierce & Kuipers, 1997; Modayil & Kuipers, 2008) provide the agent with the geometry of its sensors, and the ability to individuate and track moving objects within its sensory field of view. Our goal is to find a general way for such a learning agent to learn effective actions, ranging from learning to use a hand to manipulate objects on a table-top, to learning to move through a complex environment without collisions with walls or pedestrians, to learning to balance and walk.

We draw on insights from two superficially different projects that have complementary strengths. The QLAP system (Mugan, 2010; Mugan & Kuipers, 2012) exploits a qualitative abstraction of continuous sensor input, learning causal contingencies, DBN and MDP models of the causal world, and building a hierarchy of action models. The MPEPC system (Park & Kuipers, 2011, 2012) factors the continuous navigation problem for a mobile robot into a local unconstrained control and a global optimization process that balances constraints such as progress and collision avoidance. Both methods have a local phase (learning contingencies and local control laws), and a global phase (learning a hierarchy of actions, and finding extended routes that balance constraints). By reformulating these two methods within a common memory-based framework, we believe we can combine their strengths and overcome their limitations.

We plan to use memory-based (case-based) learning methods that highlight common properties of these two methods and that allow features of the presenting case to retrieve related cases from the knowledge base. The lowest-level representation is a simple feature vector. In the case of local motion control, it specifies the target pose location in the egocentric frame of reference, along with the parameters of the motion control law that attempts to reach it, and the quality of the resulting trajectory. Cases are retrieved by Nearest Neighbor search, and information from the retrieved cases is combined by Locally Weighted Regression (Atkeson et al., 1997) or its successor, Locally Weighted Projection Regression (Vijayakumar et al., 2005). At the higher level of action learning, a case is described by identifying the critical environmental constraints that determine the global structure of the action.
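As a rough illustration of the retrieve-then-combine step described above, the following minimal sketch stores cases as (feature vector, outcome) pairs, retrieves the k nearest cases to a query, and fits a distance-weighted local linear model to them. It is not the project's implementation: the function names, the Gaussian kernel, the `bandwidth` and `ridge` parameters, and the choice of a scalar outcome (for example, trajectory quality) are all assumptions made for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_cases(memory, query, k):
    """Return the k stored (features, outcome) cases closest to the query."""
    return sorted(memory, key=lambda case: distance(case[0], query))[:k]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lwr_predict(memory, query, k=8, bandwidth=1.0, ridge=1e-6):
    """Predict the outcome at `query` by locally weighted linear regression.

    Hypothetical parameters: `bandwidth` sets the Gaussian kernel width and
    `ridge` is a small regularizer that keeps the normal equations solvable.
    """
    cases = nearest_cases(memory, query, k)
    d = len(query) + 1  # one intercept term plus one slope per feature
    A = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for features, outcome in cases:
        # Closer cases get more influence on the local fit.
        w = math.exp(-0.5 * (distance(features, query) / bandwidth) ** 2)
        # Center features on the query so the intercept is the prediction.
        phi = [1.0] + [f - q for f, q in zip(features, query)]
        for i in range(d):
            b[i] += w * phi[i] * outcome
            for j in range(d):
                A[i][j] += w * phi[i] * phi[j]
    beta = solve(A, b)
    return beta[0]
```

In the local motion control setting, a feature vector might concatenate the target pose with the control law parameters, with the observed trajectory quality as the outcome, so that `lwr_predict` estimates the quality of a candidate parameter setting from similar past attempts. Locally Weighted Projection Regression replaces the batch fit used here with an incremental one that also reduces the input dimension, which this sketch does not attempt.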

Our experiments will focus on learning a hierarchy of progressively more sophisticated actions in the two domains covered by the previous projects: manipulation of blocks on a table top, and navigation planning in an incompletely known dynamic environment.

Intellectual Merit: This project combines insights from two different approaches for learning and representing effective action in complex and partially known domains. We will use sophisticated memory-based methods for identifying and combining relevant knowledge stored in episodic memory about previous action attempts. The resulting action models become part of the foundation for commonsense knowledge in an intelligent agent.

Broader Impact: This project will train a graduate student in computer science, robotics, computer vision, machine learning, human modeling, and control, helping to meet critical national needs. A better understanding of developmental learning of skilled and robust action has implications for the understanding of learning disabilities. This could lead, in the long run, to advances in both diagnosis and remediation of learning disabilities. We have also found that robotics is a very effective topic for outreach to the general public, including encouraging students to pursue further education in STEM fields.


Related Recent Publications

These papers are related to this project, but either predate it or were supported by other funding.

The full set of papers on our intelligent wheelchair research is available.

This work has taken place in the Intelligent Robotics Lab in the Computer Science and Engineering Division of the Electrical Engineering and Computer Science Department at the University of Michigan. Research of the Intelligent Robotics Lab is supported in part by grant IIS-1252987 from the National Science Foundation.