[ home ] [ people ] [ projects ] [ courses ] [ meetings ]


Shaping and policy search in reinforcement learning and On Spectral Clustering: Analysis and an algorithm


Monday, March 25th -- Andrew Ng


Abstract:
 

I will present a talk in two parts:

Shaping and policy search in reinforcement learning

In the first part of the talk, I will present two ideas that have enabled the successful application of reinforcement learning in the domains of four-legged robot locomotion and the control of helicopter flight.

In reinforcement learning, "shaping" refers to the important practice of giving a learning algorithm "hints" by modifying the reward function. While often crucial to making learning tractable, shaping frequently changes the problem in unanticipated ways that cause poor solutions to be learned. In this talk, I will present a theory of shaping that shows how these problems can be eliminated, and also give guidelines for designing good shaping functions that in practice result in significant speedups of the learning process.

A second issue in reinforcement learning is that naive algorithms often scale exponentially in the number of state variables, and are thus frequently impractical. I will present PEGASUS, a method for evaluating and finding good controllers. The key insight of this method is that, when using a computer simulation to evaluate policies, we can use the same set of random samples to evaluate different policies. This leads to an efficient algorithm---one whose ``sample complexity'' is polynomial, rather than exponential, in the dimension of the problem---for which we can give strong performance guarantees.

On Spectral Clustering: Analysis and an algorithm

Despite several empirical successes of spectral clustering methods---algorithms that cluster points using eigenvectors of matrices derived from the data---there are several unresolved issues. First, there are a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this part of the talk, I present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab. Applying tools from matrix perturbation theory, I also give conditions under which it can be expected to perform well. I also present some surprisingly good experimental results of the algorithm on a number of challenging clustering problems.