ECE Departmental Seminar

Learning for Optimal Control with Robustness Guarantees: Global Convergence and Implicit Regularization of Policy Optimization

Kaiqing Zhang

Monday, 10/7/19, 11:30am
Light Engineering 250

Abstract: Despite the tremendous empirical success of policy optimization (PO) in (deep) reinforcement learning (RL), its theoretical foundations have not been fully investigated, especially in the multi-agent and robust control settings. In this talk, we study the convergence theory of policy optimization for two simple benchmark control problems: zero-sum linear quadratic games and mixed H2/H∞ control design, both of which are closely related to robust control synthesis. In particular, we discuss the landscape of both problems from an optimization perspective, and develop policy optimization algorithms for finding the solutions, i.e., the Nash equilibrium of the former and the optimal control subject to a robustness constraint of the latter. We show that our PO algorithms converge to the global solutions from any valid initialization, that is, they enjoy the “global convergence” property in spite of the nonconvexity of the problems. Simulation results are also provided to corroborate our theory. Our work serves as an initial step toward understanding the theoretical aspects of policy-based RL algorithms for both multi-agent RL in general and H∞ control synthesis.

Bio: Kaiqing Zhang (S'16) received the B.Eng. degree from the Department of Automation at Tsinghua University, Beijing, China, in 2015, and M.S. degrees in both Electrical and Computer Engineering and Applied Mathematics from the University of Illinois at Urbana-Champaign (UIUC), Urbana, IL, USA, in 2017. He is currently pursuing the Ph.D. degree in the Department of Electrical and Computer Engineering at UIUC. His current research interests are reinforcement learning, game theory, optimization in multi-agent/networked systems, and robust control, with applications in cyber-physical systems including smart grid and electricity markets, transportation networks, and robotics.