Date: 4 April 2023
Time: 13:00
Venue: Anfiteatro PA2, Técnico – Campus Alameda
Speaker: Zita Marinho (DeepMind)
Title: “Model-Value Self-Consistent Updates and Applications”
Abstract:
«Learned models of the environment provide reinforcement learning agents with flexible ways of making predictions about the environment. Models enable planning, i.e. using more computation to improve value functions or policies, without requiring additional environment interactions. In this talk, we investigate a way of augmenting model-based RL by additionally encouraging a learned model and value function to be jointly self-consistent. This work covers possible ways to use self-consistency updates both for policy evaluation and control (Farquhar et al. 20) and as a proxy for epistemic uncertainty in exploration (Filos et al. 22).»
Biographical Note:
«Zita Marinho is a Research Scientist at DeepMind, where she is currently working on reinforcement learning. She holds a dual PhD/MSc in Robotics from the Robotics Institute and from IST, University of Lisbon, as part of the CMU/Portugal program. She received her MSc degree in Physics Engineering from Instituto Superior Técnico, Universidade de Lisboa in 2010. Her research interests lie at the intersection of machine learning algorithms and Natural Language Processing. She is particularly interested in studying how agents can interact and learn more effectively from those interactions. During her PhD, she studied spectral algorithms for sequence prediction and planning. She was jointly advised by Prof. André Martins at Unbabel/IST, Prof. Geoffrey Gordon at the Machine Learning Department/CMU and Prof. Siddhartha Srinivasa from the University of Washington.»
Priberam is a member of the IST Spin-Off® Community.
The “Priberam Machine Learning Lunch Seminars” are free to attend, subject to prior registration.