Decision Awareness in Reinforcement Learning

November 7, 2022, 11:00 AM to 12:00 PM

Pierre-Luc Bacon – Université de Montréal, Canada

Decision awareness is the learning principle according to which the components of a learning system ought to be optimized directly to satisfy the global performance criterion: to produce optimal decisions. This end-to-end perspective has recently led to significant advances in model-based reinforcement learning by addressing the problem of compounding errors that plagues alternative approaches. In this talk, I will present some of our recent work on this topic: (1) learning control-oriented transition models by implicit differentiation, and (2) learning neural ordinary differential equations end-to-end for nonlinear trajectory optimization. Along the way, we will also discuss some of the computational challenges associated with those methods and our attempts at scaling up performance, specifically by using an efficient factorization of the Jacobians in the forward mode of automatic differentiation through novel constrained optimizers inspired by adversarial learning.

Federico Bobbio, organizer

Defeng Liu, organizer

Léa Ricard, organizer

Location

Hybrid event at GERAD

Zoom and room 4488
Pavillon André-Aisenstadt
Université de Montréal campus
2920, chemin de la Tour
Montréal Québec H3T 1J4
Canada

Associated organization

Canada Excellence Research Chair in Data Science for Real-Time Decision-Making