TY - GEN
T1 - Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems
AU - Bargiacchi, Eugenio
AU - Verstraeten, Timothy
AU - Roijers, Diederik
AU - Nowé, Ann
AU - van Hasselt, Hado
N1 - Conference code: 35th
PY - 2018
Y1 - 2018
N2 - Learning to coordinate between multiple agents is an important problem in many reinforcement learning problems. Key to learning to coordinate is exploiting loose couplings, i.e., conditional independences between agents. In this paper we study learning in repeated fully cooperative games, multi-agent multi-armed bandits (MAMABs), in which the expected rewards can be expressed as a coordination graph. We propose multi-agent upper confidence exploration (MAUCE), a new algorithm for MAMABs that exploits loose couplings, which enables us to prove a regret bound that is logarithmic in the number of arm pulls and only linear in the number of agents. We empirically compare MAUCE to sparse cooperative Q-learning, and a state-of-the-art combinatorial bandit approach, and show that it performs much better on a variety of settings, including learning control policies for wind farms.
UR - http://www.scopus.com/inward/record.url?scp=85057228542&partnerID=8YFLogxK
M3 - Conference paper
VL - 2
SP - 810
EP - 818
BT - 35th International Conference on Machine Learning, ICML 2018
A2 - Dy, Jennifer
A2 - Krause, Andreas
T2 - International Conference on Machine Learning 2018
Y2 - 10 July 2018 through 15 July 2018
ER -