Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation

Guohui Ding, Joewie Koh, Kelly Merckaert, Bram Vanderborght, Marco Nicotra, Christoffer Heckman, Alessandro Roncone, Lijun Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

10 Citations (Scopus)

Abstract

We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL). We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL), where each agent applies Q-learning with individual reward functions; and game-theoretic RL (GT-RL), where the agents update their Q-values based on the Nash equilibrium of a bimatrix Q-value game. We validate the proposed approaches in the setting of cooperative object manipulation with two simulated robot arms. Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
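The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-state toy task, the reward function `toy_rewards`, the helper names, and all hyperparameters are assumptions made for illustration. The first part shows two independent tabular Q-learners, each driven by its own reward, in the spirit of DA-RL. The second part finds pure-strategy Nash equilibria of a bimatrix game built from two Q-value matrices, which is the kind of stage game GT-RL refers to. Mixed equilibria and the paper's actual update rule are not covered.

```python
import itertools
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step in place and return the new value."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def pure_nash(q1, q2):
    """Return all pure-strategy Nash equilibria (a1, a2) of the bimatrix game.

    q1[a1][a2] and q2[a1][a2] hold each agent's Q-value for the joint action.
    """
    n1, n2 = len(q1), len(q1[0])
    equilibria = []
    for a1, a2 in itertools.product(range(n1), range(n2)):
        # Neither agent can gain by deviating unilaterally.
        best1 = all(q1[a1][a2] >= q1[b][a2] for b in range(n1))
        best2 = all(q2[a1][a2] >= q2[a1][b] for b in range(n2))
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


def toy_rewards(a1, a2):
    """Hypothetical individual rewards: paid only if both agents pick action 1."""
    r = 1.0 if (a1, a2) == (1, 1) else 0.0
    return r, r


def train_independent(steps=2000, epsilon=0.2, seed=0):
    """Train two independent epsilon-greedy Q-learners on the one-state toy task."""
    rng = random.Random(seed)
    tables = [{0: [0.0, 0.0]}, {0: [0.0, 0.0]}]  # one Q-table per agent
    for _ in range(steps):
        actions = []
        for q in tables:
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))  # explore
            else:
                actions.append(max(range(2), key=lambda a: q[0][a]))  # exploit
        rewards = toy_rewards(*actions)
        for q, a, r in zip(tables, actions, rewards):
            q_update(q, 0, a, r, 0, gamma=0.0)  # gamma=0: one-shot task
    # Greedy action of each agent after training.
    return [max(range(2), key=lambda a: q[0][a]) for q in tables]
```

In this sketch each agent keeps a table over its own actions only, so nothing is shared between the learners except the environment. `pure_nash` is applied to full joint-action matrices instead, which is where the two approaches differ in what each agent must represent.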
Original language: English
Title of host publication: International Conference on Autonomous Agents and Multi-Agent Systems
Editors: Bo An, Amal El Fallah Seghrouchni, Gita Sukthankar
Publisher: ACM
Pages: 1831-1833
Number of pages: 3
ISBN (Electronic): 978-1-4503-7518-4
Publication status: Published - 1 Jan 2020
Event: The 19th International Conference on Autonomous Agents and Multi-Agent Systems - Auckland, New Zealand
Duration: 9 May 2020 to 13 May 2020
Conference number: 19
https://aamas2020.conference.auckland.ac.nz/

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume: 2020-May
ISSN (Print): 1548-8403
ISSN (Electronic): 1558-2914

Conference

Conference: The 19th International Conference on Autonomous Agents and Multi-Agent Systems
Abbreviated title: AAMAS 2020
Country/Territory: New Zealand
City: Auckland
Period: 9/05/20 to 13/05/20
