Abstract

Physical systems are progressively moving from local controllers towards smarter cloud-based architectures. This allows similar, interconnected reinforcement learning agents to share information in order to obtain a more global perspective on the control task at hand. However, in practice the local context and inherent properties of these agents are not identical, so naively pooling the gathered information is unsuitable. We propose to detect correlations between the observed dynamics of similar agents through dependent Gaussian processes, allowing information to be shared effectively between these agents. We validate our approach in a pendulum swing-up and a cart-pole setting. By quickly and accurately estimating dependencies, our approach significantly outperforms the naive method of combining all samples into one model. In future work, we expect to improve our results by also measuring correlations between rewards.
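The core idea of sharing data between agents through dependent Gaussian processes can be sketched with a simple multi-output GP. The sketch below is illustrative and not the authors' implementation: it assumes an intrinsic-coregionalization-style joint kernel K((x,i),(x',j)) = B[i,j]·k(x,x'), where the matrix `B` (a hypothetical choice here) encodes the correlation between two agents' dynamics, so that abundant samples from one agent improve predictions for a data-poor but correlated agent.

```python
import numpy as np

def rbf(x1, x2, ls=0.2):
    # Squared-exponential kernel between 1-D input arrays.
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def dependent_gp_predict(X, tasks, y, Xs, task_s, B, noise=1e-4):
    # Joint GP regression over all agents' samples.
    # Kernel between samples of agents i and j: B[i, j] * rbf(x, x').
    K = B[np.ix_(tasks, tasks)] * rbf(X, X)
    Ks = B[np.ix_([task_s] * len(Xs), tasks)] * rbf(Xs, X)
    alpha = np.linalg.solve(K + noise * np.eye(len(X)), y)
    return Ks @ alpha  # posterior mean for agent `task_s` at inputs Xs

# Toy data: agent 1's dynamics are a scaled copy of agent 0's,
# but agent 1 has only a few samples of its own.
rng = np.random.default_rng(0)
X0 = rng.uniform(0, 1, 20); y0 = np.sin(2 * np.pi * X0)
X1 = rng.uniform(0, 1, 3);  y1 = 0.8 * np.sin(2 * np.pi * X1)

X = np.concatenate([X0, X1])
tasks = np.array([0] * 20 + [1] * 3)
y = np.concatenate([y0, y1])

# Assumed (rank-1) coregionalization matrix expressing strong correlation.
B = np.array([[1.0, 0.8],
              [0.8, 0.64]])

Xs = np.linspace(0, 1, 50)
mu = dependent_gp_predict(X, tasks, y, Xs, task_s=1, B=B)
```

In practice the entries of `B` would be learned from data (e.g. by marginal-likelihood maximization) rather than fixed, which is what "quickly and accurately estimating dependencies" refers to; with `B` close to diagonal the model falls back to independent per-agent GPs.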
Original language: English
Media of output: Poster
Number of pages: 6
Publication status: Published - 9 Dec 2016