Friday, April 1, 2022
4:10 – 5:00 p.m.
Dr. Siva Theja Maguluri
Assistant Professor in the H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech
- Lyapunov methods for Stochastic Approximation and Reinforcement Learning
- Generalized Moreau envelope, based on infimal convolution smoothing, as a Lyapunov function for Stochastic Approximation of contractive operators
- Unified framework to obtain the sample complexity of a large class of RL algorithms
- Linear speedup in the number of agents for Federated Reinforcement Learning
“A Lyapunov Theory of Finite-Sample Guarantees of Stochastic Approximation and Reinforcement Learning”
The focus of our work is to obtain finite-sample and/or finite-time convergence bounds for various model-free Reinforcement Learning (RL) algorithms. Many RL algorithms involve solving the Bellman fixed-point equation, which is done using Stochastic Approximation (SA). SA is a popular approach for solving fixed-point equations when the information is corrupted by noise. We develop a Lyapunov framework and obtain mean-square error bounds on the convergence of a general class of SA algorithms for contractive operators under general norms and Markovian noise. The key tool we use is the generalized Moreau envelope as a smooth potential/Lyapunov function. These powerful results immediately provide sample complexity results for a large class of RL algorithms, including TD learning, Q-learning, actor-critic algorithms, their off-policy variants, and their distributed variants. The talk will present a couple of these applications in off-policy RL and/or Federated RL.
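For readers new to the topic, the following is a minimal illustrative sketch of the kind of SA iteration the abstract refers to; it is not taken from the talk or the paper. It assumes a toy two-state Bellman operator for policy evaluation (a contraction with factor `GAMMA` in the sup norm), i.i.d. Gaussian noise rather than the Markovian noise treated in the work, and a step size of 1/(k+1). All names and numerical values (`P`, `R`, `GAMMA`, `bellman`, `stochastic_approximation`) are hypothetical.

```python
import random

# Toy problem (assumed for illustration): a 2-state Markov reward process.
GAMMA = 0.5                      # discount factor = contraction factor
P = [[0.9, 0.1], [0.2, 0.8]]     # transition matrix
R = [1.0, 0.0]                   # per-state reward


def bellman(x):
    """Contractive operator F(x) = R + GAMMA * P x."""
    n = len(x)
    return [R[i] + GAMMA * sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]


def stochastic_approximation(num_steps, noise_std=1.0, seed=0):
    """Run x_{k+1} = x_k + a_k * (F(x_k) + w_k - x_k) with a_k = 1/(k+1).

    w_k is i.i.d. Gaussian noise here, a simplification of the
    Markovian noise setting described in the abstract.
    """
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for k in range(num_steps):
        step = 1.0 / (k + 1)
        noisy_fx = [v + rng.gauss(0.0, noise_std) for v in bellman(x)]
        x = [xi + step * (fi - xi) for xi, fi in zip(x, noisy_fx)]
    return x


def fixed_point(iterations=200):
    """Noise-free fixed point of F, found by repeated application."""
    x = [0.0, 0.0]
    for _ in range(iterations):
        x = bellman(x)
    return x


def squared_error(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


if __name__ == "__main__":
    x_star = fixed_point()
    for n in (100, 1000, 20000):
        estimate = stochastic_approximation(n)
        print(n, squared_error(estimate, x_star))
```

Running the script prints the squared error of a single noisy run at a few horizons. Finite-sample bounds of the kind described in the abstract quantify how fast the mean of this error shrinks as the number of steps grows.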
Siva Theja Maguluri is Fouts Family Early Career Professor and Assistant Professor in the H. Milton Stewart School of Industrial and Systems Engineering at Georgia Tech. He obtained his Ph.D. and MS in ECE as well as MS in Applied Math from UIUC, and B.Tech in Electrical Engineering from IIT Madras. His research interests span the areas of Control, Optimization, Algorithms and Applied Probability. In particular, he works on Reinforcement Learning theory, scheduling, resource allocation and revenue optimization problems that arise in a variety of systems including Data Centers, Cloud Computing, Wireless Networks, Block Chains, Ride hailing systems, etc. His research and teaching are recognized through several awards including the “Best Publication in Applied Probability” award, NSF CAREER award, second place award at INFORMS JFIG best paper competition, Student best paper award at IFIP Performance, “CTL/BP Junior Faculty Teaching Excellence Award,” and “Student Recognition of Excellence in Teaching: Class of 1934 CIOS Award.”
Dr. Maguluri: https://sites.google.com/site/sivatheja/
On Zoom @ 4:10 p.m. on Friday, 4/1/22
Join Zoom Meeting
Meeting ID: 963 4348 1647