Sparse Interactions in Multi-Agent Reinforcement Learning

Yann-Michaël De Hauwere

Research output: Book/Report › Book


Reinforcement learning has already been widely used in unknown domains with a
high degree of uncertainty. It is also an interesting paradigm for domains in
which multiple agents act together. In such domains, however, several additional
problems arise: agents behave autonomously and might have conflicting goals. A
straightforward approach is to let every agent always observe the state
information of the other agents, as well as the actions they take and the
rewards they receive. This allows the agents to learn to reach equilibrium
points in the environment, but it comes at a high cost: the agents learn in the
joint state-action space, which considerably slows down the learning process.
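
To see why the joint space is costly, note that with |S| local states, |A| actions and n agents, the joint state-action space contains (|S|·|A|)^n entries. A minimal sketch with illustrative numbers (not taken from the dissertation):

```python
# Joint state-action space growth: with |S| local states, |A| actions
# and n agents, the joint space has (|S| * |A|)**n entries.
S, A = 25, 4            # e.g. a 5x5 gridworld with 4 movement actions
for n in (1, 2, 3):
    print(n, "agents:", (S * A) ** n, "state-action pairs")
```

Even three agents in a small gridworld already yield a million entries, which illustrates why learning in the full joint space is slow.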

In this dissertation we argue that in settings where the interactions between
agents are sparse, an efficient learning approach is to let the agents learn
individually and take the other agents into account only when necessary. In the
former case, the agents do not influence each other in a particular state: the
state transition function and the reward function are independent of the states
and actions of the other agents in the environment. Learning then reduces to
single-agent reinforcement learning, and an agent can safely ignore the other
agents. In the latter case, when this independence requirement does not hold,
we are dealing with a multi-agent coordination problem and a multi-agent
learning approach is required. A key question is how to determine when such
interactions occur.
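
The switching idea above can be sketched as a Q-learning agent that keys its table on its own state by default and augments it with the other agent's state only in states flagged as requiring coordination. This is an illustrative sketch, assuming a learned set of coordination states; the class and method names are mine, not from the dissertation:

```python
import random
from collections import defaultdict

class SparseInteractionAgent:
    """Hypothetical sketch: learn on the local state by default,
    expand to the joint state only where coordination was detected."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)        # (state_key, action) -> value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.coordination_states = set()   # states flagged as interactions

    def state_key(self, own_state, other_state):
        # Augment with the other agent's state only during interactions.
        if own_state in self.coordination_states:
            return (own_state, other_state)
        return (own_state,)

    def act(self, own_state, other_state):
        key = self.state_key(own_state, other_state)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(key, a)])

    def update(self, own_state, other_state, action, reward,
               next_own, next_other):
        key = self.state_key(own_state, other_state)
        next_key = self.state_key(next_own, next_other)
        best_next = max(self.q[(next_key, a)] for a in self.actions)
        self.q[(key, action)] += self.alpha * (
            reward + self.gamma * best_next - self.q[(key, action)])
```

Outside the coordination states the agent behaves exactly like a single-agent Q-learner, so the joint-space blow-up is paid only where interactions actually occur.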

We propose novel approaches that learn in which states such sparse interactions
occur and, based on this information, apply either a single-agent or a
multi-agent approach. The first algorithm, called 2Observe, exploits spatial
dependencies in the joint state space to learn the set of states in which
sparse interactions occur. It is based on generalised learning automata, which
can approximate these dependencies in the state space. The second algorithm,
called CQ-learning, uses the immediate reward signal to determine the influence
of other agents in certain states: by performing statistical tests on the
immediate rewards, it identifies the relevant state information of other agents
during sparse interactions. The last algorithm, called FCQ-learning, extends
this idea and also anticipates coordination issues several timesteps before
they actually occur, so that they can be dealt with in a timely fashion. This
is achieved by performing the statistical tests on the sum of immediate and
future rewards.
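
The detection step behind CQ-learning can be sketched as a two-sample test: compare the immediate rewards a state yields in the multi-agent setting against those collected when acting alone, and flag an interaction when they deviate significantly. The sketch below uses a Welch t-statistic; the function name and threshold are illustrative assumptions, not the dissertation's exact test:

```python
from statistics import mean, variance

def interaction_suspected(single_agent_rewards, observed_rewards,
                          threshold=2.0):
    """Hypothetical sketch of reward-based interaction detection:
    a large deviation of observed rewards from the single-agent
    baseline suggests another agent is influencing this state."""
    n1, n2 = len(single_agent_rewards), len(observed_rewards)
    m1, m2 = mean(single_agent_rewards), mean(observed_rewards)
    v1, v2 = variance(single_agent_rewards), variance(observed_rewards)
    se = (v1 / n1 + v2 / n2) ** 0.5      # Welch standard error
    if se == 0:
        return m1 != m2
    return abs(m1 - m2) / se > threshold
```

FCQ-learning's extension would, in this sketch, apply the same test to the sum of immediate and (discounted) future rewards, so deviations surface several timesteps before the actual conflict.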

Finally, we introduce methods to generalise knowledge about coordination
problems and demonstrate how experience can be shared between agents and
environments using 2Observe and CQ-learning. These methods are the first of
their kind to provide knowledge transfer about coordination experience in
multi-agent systems.
Original language: English
Place of Publication: Brussels
Number of pages: 238
ISBN (Print): 978-90-5487-920-6
Publication status: Published - 28 Jun 2011


  • Artificial Intelligence
  • Multi-agent Systems
  • Reinforcement Learning
  • Local Coordination
