Abstract
We present a novel model-based algorithm, Cooperative Prioritized Sweeping, for sample-efficient learning in large multi-agent Markov decision processes. Our approach leverages domain knowledge about the structure of the problem in the form of a dynamic decision network. Using this information, our method learns a model of the environment to determine which state-action pairs are most likely to need updating, significantly increasing learning speed. Batch updates can then be performed, which efficiently back-propagate knowledge throughout the value function. Our method outperforms the state-of-the-art sparse cooperative Q-learning and QMIX algorithms on the well-known SysAdmin benchmark, on randomized environments, and on a fully observable variation of the well-known firefighter benchmark from the Dec-POMDP literature.
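To make the idea of prioritizing updates concrete, the sketch below shows classic single-agent, tabular prioritized sweeping over a learned deterministic model. It is not the cooperative, factored variant described in the abstract, which additionally exploits the dynamic decision network. The function name, the `transitions` format, and the toy chain environment are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def prioritized_sweeping(transitions, n_actions, gamma=0.9, alpha=1.0,
                         theta=1e-4, max_updates=1000):
    """Sweep a learned deterministic model in order of update priority.

    transitions: dict mapping (state, action) -> (reward, next_state).
    This format is hypothetical; it stands in for a model learned from
    experience. States without outgoing transitions act as terminals.
    """
    q = defaultdict(float)

    # predecessors[s'] holds every (s, a) pair known to lead to s'.
    predecessors = defaultdict(set)
    for (s, a), (_, s_next) in transitions.items():
        predecessors[s_next].add((s, a))

    def td_error(s, a):
        r, s_next = transitions[(s, a)]
        best_next = max(q[(s_next, b)] for b in range(n_actions))
        return r + gamma * best_next - q[(s, a)]

    # Seed the queue with pairs whose error exceeds the threshold.
    # heapq is a min-heap, so priorities are stored negated.
    queue = []
    for (s, a) in transitions:
        err = abs(td_error(s, a))
        if err > theta:
            queue.append((-err, s, a))
    heapq.heapify(queue)

    updates = 0
    while queue and updates < max_updates:
        _, s, a = heapq.heappop(queue)
        q[(s, a)] += alpha * td_error(s, a)
        updates += 1
        # A change at s can only affect pairs that lead into s,
        # so only those are re-examined and possibly re-queued.
        for (ps, pa) in predecessors[s]:
            err = abs(td_error(ps, pa))
            if err > theta:
                heapq.heappush(queue, (-err, ps, pa))
    return dict(q), updates


# Toy chain 0 -> 1 -> 2 -> 3: action 0 moves right, action 1 stays put.
# Entering the terminal state 3 yields reward 1.
chain = {
    (0, 0): (0.0, 1), (0, 1): (0.0, 0),
    (1, 0): (0.0, 2), (1, 1): (0.0, 1),
    (2, 0): (1.0, 3), (2, 1): (0.0, 2),
}
q_values, n_updates = prioritized_sweeping(chain, n_actions=2)
print(n_updates, q_values[(0, 0)])
```

In this toy run the reward at the end of the chain is propagated backwards one predecessor at a time, so pairs that cannot yet change are never touched. The multi-agent setting in the abstract presumably makes this selection harder, since the joint state-action space is too large to enumerate.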
Original language | English |
---|---|
Title | Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2021 |
Publisher | IFAAMAS |
Pages | 160-168 |
Number of pages | 9 |
ISBN (electronic) | 9781713832621 |
DOIs | |
Status | Published - 2021 |
Event | The 20th International Conference on Autonomous Agents and Multiagent Systems - Virtual. Duration: 3 May 2021 → 7 May 2021. https://aamas2021.soton.ac.uk/ |
Publication series
Name | Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS |
---|---|
Volume | 1 |
ISSN (print) | 1548-8403 |
ISSN (electronic) | 1558-2914 |
Conference
Conference | The 20th International Conference on Autonomous Agents and Multiagent Systems |
---|---|
Abbreviated title | AAMAS 2021 |
Period | 3/05/21 → 7/05/21 |
Internet address |
-
VLAAI1: Grant: Flanders Artificial Intelligence (AI) Research Programme
1/07/19 → 31/12/23
Project: Applied
-
FWOSB87: Mastering large-scale multi-agent environments with model-based reinforcement learning
1/11/19 → 31/10/21
Project: Fundamental
-
FWOSB27: Robust Fleet-Wide Reinforcement Learning
Nowe, A., Verstraeten, T. & Helsen, J.
1/01/17 → 31/12/20
Project: Fundamental