Decision-theoretic planning for the expected scalarised returns

Conor F. Hayes, Diederik M. Roijers, Enda Howley, Patrick Mannion

Research output: Conference paper, Research

2 Citations (Scopus)

Abstract

In sequential multi-objective decision making (MODeM) settings, when the utility of a user is derived from a single execution of a policy, policies for the expected scalarised returns (ESR) criterion should be computed. In multi-objective settings, a user’s preferences over objectives, or utility function, may be unknown at the time of planning. When the utility function of a user is unknown, multi-policy methods are deployed to compute a set of optimal policies. However, the state-of-the-art sequential MODeM multi-policy algorithms compute a set of optimal policies for the scalarised expected returns (SER) criterion. Algorithms that compute a set of optimal policies for the SER criterion utilise expected value vectors which cannot be used when optimising for the ESR criterion. We propose multi-objective distributional value iteration (MODVI) that replaces value vectors with distributions over the returns and computes a set of optimal policies for the ESR criterion.
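
The distinction between the two criteria named in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: the two-outcome return distribution, the product utility function, and all function names below are assumptions chosen only to show that, for a nonlinear utility, applying the utility to the expected return vector (SER) can give a different value than taking the expectation of the utility of each return (ESR).

```python
from typing import Callable, List, Tuple

Return = Tuple[float, ...]
# A return distribution as (probability, return vector) pairs (illustrative only).
Distribution = List[Tuple[float, Return]]


def expected_return(dist: Distribution) -> Return:
    """Component-wise expectation of the return vectors."""
    n_objectives = len(dist[0][1])
    return tuple(sum(p * r[i] for p, r in dist) for i in range(n_objectives))


def ser(dist: Distribution, utility: Callable[[Return], float]) -> float:
    """Scalarised expected returns: utility of the expected return vector."""
    return utility(expected_return(dist))


def esr(dist: Distribution, utility: Callable[[Return], float]) -> float:
    """Expected scalarised returns: expectation of the utility of each return."""
    return sum(p * utility(r) for p, r in dist)


def product_utility(r: Return) -> float:
    """Hypothetical nonlinear utility: the user needs both objectives at once."""
    return r[0] * r[1]


# Hypothetical policy: each execution yields (10, 0) or (0, 10) with equal probability.
lottery: Distribution = [(0.5, (10.0, 0.0)), (0.5, (0.0, 10.0))]

ser_value = ser(lottery, product_utility)  # utility of (5, 5) = 25.0
esr_value = esr(lottery, product_utility)  # 0.5 * 0 + 0.5 * 0 = 0.0
print(ser_value, esr_value)
```

Under these assumptions the expected value vector (5, 5) looks attractive, yet no single execution ever delivers both objectives, which is why an expected value vector alone does not suffice when utility is derived from a single execution of a policy.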
Original language: English
Title: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 1621-1623
Number of pages: 3
Electronic ISBN: 9781713854333
Status: Published - 9 May 2022
Event: The 21st International Conference on Autonomous Agents and Multiagent Systems - Online
Duration: 9 May 2022 to 13 May 2022
https://aamas2022-conference.auckland.ac.nz/

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume: 3
Print ISSN: 1548-8403
Electronic ISSN: 1558-2914

Conference

Conference: The 21st International Conference on Autonomous Agents and Multiagent Systems
Abbreviated title: AAMAS 2022
Period: 9/05/22 to 13/05/22

Bibliographical note

Funding Information:
Conor F. Hayes is funded by the National University of Ireland Hardiman Scholarship. This research was supported by funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” program.

Publisher Copyright:
© 2022 International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

Copyright:
Copyright 2022 Elsevier B.V., All rights reserved.
