Maximizing Exploration in Unknown Dynamical
Environments with (Potential) Attractors
Exploration of an unknown system is a fundamental problem in many fields of science and engineering, including neuroscience, system identification, and reinforcement learning (RL). A variety of approximate solutions, primarily originating from RL, have been proposed to tackle it. While these methods have found empirical success, exploration has traditionally been formulated as an episodic, offline problem. This is not the case in neuroscience, where exploration must be carried out online in a non-episodic manner. Moreover, environments in RL are primarily input-driven, whereas neural systems have strong intrinsic dynamics and the strength and frequency of the actions the agent can take are limited. Surprisingly, the problem of online exploration has received relatively little attention from the RL community. To address it, we introduce Escape-the-Maze (EtM), a model-based RL algorithm that combines a simple exploration bonus with an uncertainty-aware dynamics model. We prove that this combination is synergistic: the agent reduces its uncertainty over the dynamics model while simultaneously exploring the state space. We empirically validate the proposed approach against prior exploration-based deep RL methods and find not only that it is superior, but that many of those approaches are ill-suited to this problem.
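The abstract does not spell out how an exploration bonus and an uncertainty-aware dynamics model fit together, so the following is only an illustrative sketch, not EtM itself. It assumes a common proxy for epistemic uncertainty (disagreement across a bootstrapped ensemble of linear dynamics models) and a bounded action set, and runs the loop online and non-episodically: at each step the agent takes the action whose predicted outcome the ensemble disagrees on most, then refits the ensemble on all data seen so far.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unknown dynamics (hidden from the agent): x' = A x + B u.
# The intrinsic dynamics are stable attractor-like (spectral radius < 1),
# and actions u are bounded, mirroring the setting described above.
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])

def env_step(x, u):
    return A_true @ x + B_true @ u

class EnsembleDynamics:
    """Ensemble of linear models fit on bootstrap resamples of the data.
    Prediction disagreement across members serves as an uncertainty
    estimate and doubles as the exploration bonus."""
    def __init__(self, n_members=5):
        # Random initial weights so disagreement is nonzero before any data.
        self.members = [rng.normal(0.0, 0.5, size=(2, 3))
                        for _ in range(n_members)]

    def fit(self, X, U, Xn):
        Z = np.hstack([X, U])  # regressors [x, u] -> next state
        for i in range(len(self.members)):
            idx = rng.integers(0, len(Z), size=len(Z))  # bootstrap resample
            W, *_ = np.linalg.lstsq(Z[idx], Xn[idx], rcond=None)
            self.members[i] = W.T

    def uncertainty(self, x, u):
        z = np.concatenate([x, u])
        preds = np.stack([W @ z for W in self.members])
        return preds.std(axis=0).sum()  # ensemble disagreement

# Online, non-episodic loop: pick the bounded action with the highest
# predicted-disagreement bonus, observe, and refit the model.
actions = [np.array([a]) for a in (-1.0, 0.0, 1.0)]
x = np.zeros(2)
X, U, Xn = [], [], []
model = EnsembleDynamics()
for t in range(50):
    u = max(actions, key=lambda a: model.uncertainty(x, a))
    xn = env_step(x, u)
    X.append(x); U.append(u); Xn.append(xn)
    model.fit(np.array(X), np.array(U), np.array(Xn))
    x = xn

print("final disagreement:", model.uncertainty(x, np.array([1.0])))
```

Because the data here are noiseless and linear, the ensemble members agree once enough transitions have been collected, so the bonus shrinks as the model's uncertainty is resolved, the synergy the abstract refers to, under these simplifying assumptions.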
Tuesday, April 5, 2022