Friday, November 11, 2022
10:20 – 11:10 a.m. (CST)
Virtual via Zoom: https://tamu.zoom.us/j/93347193479 (password in emails or syllabus)
Dr. Mohammad Ghavamzadeh
Senior Staff Research Scientist
Google
Title: “Mitigating the Risk Associated with Epistemic and Aleatory Uncertainties in MDPs”
Abstract
Prior work on safe reinforcement learning (RL) has studied risk-aversion to randomness in dynamics (aleatory) and to model uncertainty (epistemic) in isolation. We propose and analyze a new framework to jointly model the risk associated with epistemic and aleatory uncertainties in finite-horizon and discounted infinite-horizon MDPs. We call this framework, which combines risk-averse and soft-robust methods, RASR. We show that when the risk-aversion is defined using either the entropic value-at-risk (EVaR) or the entropic risk measure (ERM), the optimal policy in RASR can be computed efficiently using a new dynamic program formulation with a time-dependent risk level. As a result, the optimal risk-averse policies are deterministic but time-dependent, even in the infinite-horizon discounted setting. We also show that particular RASR objectives reduce to risk-averse RL with mean posterior transition probabilities. Our empirical results show that our new algorithms consistently mitigate uncertainty as measured by EVaR and other standard risk measures.
Biography
Dr. Mohammad Ghavamzadeh received a Ph.D. degree from UMass Amherst in 2005. He was a postdoctoral fellow at UAlberta from 2005 to 2008 and a permanent researcher at INRIA from 2008 to 2013. He received the “INRIA award for scientific excellence” in 2011 and obtained his Habilitation in 2014. Since 2013, he has been a senior researcher at Adobe and FAIR, and he is now a senior staff research scientist at Google. He has published over 100 refereed papers in major machine learning, AI, and control journals and conferences. He has co-chaired more than 10 workshops and tutorials at NeurIPS, ICML, and AAAI. His research has mainly focused on reinforcement learning, bandit algorithms, and recommendation systems.
More information on Dr. Ghavamzadeh can be found at:
https://mohammadghavamzadeh.github.io/
https://scholar.google.ca/citations?user=Bo-wyrkAAAAJ&hl=en
More information on past and future CESG Seminars is available at CESG Seminars (tamu.edu).