Learning compositional policies via hierarchical clustering

February 23, 2021, 12:00 pm to 1:00 pm
Rex Liu (Brown University)

Zoom link: https://mit.zoom.us/j/96471630432

Artificial agents have demonstrated tremendous success on a wide variety of tasks, at times even surpassing human performance. Yet they still show limited ability to generalise beyond the narrow settings in which they were trained, even when the new contexts are meaningfully similar to the old ones. By contrast, one of the hallmarks of natural intelligence is the ability to rapidly solve new tasks by leveraging prior knowledge. One way humans accomplish this is by abstracting a latent structure from the task that can then be transferred to new contexts. Moreover, if the task has a compositional structure, humans will flexibly recombine familiar structural components in novel ways to quickly solve the new task. For instance, a musician learning the piano will recognise that fingerings are independent of songs. When she later learns another keyboard instrument such as the organ or harpsichord, she can readily transfer her knowledge of piano fingerings while learning songs specific to the new instrument. However, whether task components should be learnt independently or treated as a joint unit depends on task statistics. If components are highly correlated with each other, it may be more advantageous to learn them as a single joint unit, but this comes at the expense of transferability. Previous work has explored the two extremes, in which components are learnt either jointly as a single unit or separately as independent components. Here, drawing on methods from hierarchical non-parametric Bayesian inference, we present an agent that learns about each component separately and also about the variety of correlational structures that may bind them together. We show that this agent is highly expressive and can readily adapt to a broad range of environment statistics with varying degrees of jointness or independence between structural components.
