DS4DM Coffee Talk

Representation-driven Option Discovery in Reinforcement Learning

Aug 23, 2023   03:00 PM — 04:00 PM

Marlos C. Machado, University of Alberta, Canada

Presentation on YouTube.

The ability to reason at multiple levels of temporal abstraction is a fundamental aspect of intelligence. In reinforcement learning, this attribute is often modeled through temporally extended courses of action called options. Despite the popularity of options as a research topic, they are seldom included as an explicit component in traditional solutions within the field. In this talk, I will try to explain why this is the case and emphasize the vital role options can play in continual learning. Rather than assuming a predetermined set of options, I will introduce a general framework for option discovery, which uses the agent's representation to discover useful options. By leveraging these options to generate a rich stream of experience, the agent can improve its representations and learn more effectively. This representation-driven option discovery approach creates a virtuous cycle of refinement, continuously improving both the representation and options, and it is particularly effective for problems that require agents to exhibit different levels of abstraction to succeed.


Bio: Marlos is an assistant professor at the University of Alberta. His research interests lie broadly in machine learning, specifically in (deep) reinforcement learning, representation learning, continual learning, and real-world applications of all of the above. He completed his B.Sc. and M.Sc. at UFMG, Brazil, and his Ph.D. at the University of Alberta. During his Ph.D., among other things, he popularized the idea of temporally extended exploration through options, introducing the idea of eigenoptions. He was a researcher at DeepMind and at Google Brain for four years, during which time he made several contributions to reinforcement learning, including the application of deep reinforcement learning to control Loon's stratospheric balloons.

Federico Bobbio, organizer
Defeng Liu, organizer

Location

Hybrid activity at GERAD
Zoom and room 4488
Pavillon André-Aisenstadt
Campus de l'Université de Montréal
2920, chemin de la Tour

Montréal Québec H3T 1J4
Canada
