Multi-objective reinforcement learning for guaranteeing alignment with multiple values

Manel Rodriguez-Soto, Roxana Radulescu, Juan A. Rodriguez-Aguilar, Maite Lopez-Sanchez, Ann Nowe

Research output: Unpublished paper

Abstract

In this paper, we address the problem of ensuring that autonomous learning agents act in alignment with multiple moral values. Specifically, we present the theoretical principles and algorithmic tools necessary for creating an environment in which an agent is guaranteed to learn a behaviour (or policy) aligned with multiple moral values while still pursuing its individual objective.
To address this value alignment problem, we adopt the Multi-Objective Reinforcement Learning framework and propose a novel algorithm that combines techniques from Multi-Objective Reinforcement Learning and Linear Programming. In addition to providing theoretical guarantees, we illustrate our value alignment process with an example involving an autonomous vehicle, demonstrating that the agent learns to behave in alignment with the ethical values of safety, achievement, and comfort. Additionally, we use a synthetic multi-objective environment generator to evaluate the computational costs associated with guaranteeing ethical learning in situations with an increasing number of values.
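
To make the combination of Multi-Objective Reinforcement Learning and Linear Programming concrete, below is a minimal Python sketch; it is not the authors' published algorithm. A small linear program picks linear scalarisation weights under illustrative constraints that force the ethical objectives (safety, comfort) to outweigh the individual task objective, and linearly scalarised Q-learning is then run on a toy vector-reward MDP. The toy environment, the margin parameter, and the constraint scheme are all assumptions made for illustration.

# Minimal sketch (illustrative, not the paper's algorithm): LP-chosen
# scalarisation weights + linearly scalarised multi-objective Q-learning.
import numpy as np
from scipy.optimize import linprog

N_STATES, N_ACTIONS, N_OBJECTIVES = 5, 2, 3   # objectives: task, safety, comfort

# --- 1. Choose scalarisation weights with a linear program (assumed constraints).
# Minimise the weight on the individual (task) objective subject to:
#   * weights form a convex combination (non-negative, sum to 1),
#   * each ethical weight exceeds the task weight by at least `margin`,
# so the ethical reward components dominate the scalarised return.
margin = 0.1
c = np.array([1.0, 0.0, 0.0])                  # minimise w_task
A_ub = np.array([[1.0, -1.0, 0.0],             # w_task - w_safety <= -margin
                 [1.0, 0.0, -1.0]])            # w_task - w_comfort <= -margin
b_ub = np.array([-margin, -margin])
A_eq, b_eq = np.ones((1, 3)), np.array([1.0])  # weights sum to 1
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0.0, 1.0)] * 3)
w = res.x

# --- 2. Scalarised Q-learning on a toy MDP with vector-valued rewards.
rng = np.random.default_rng(0)
R = rng.normal(size=(N_STATES, N_ACTIONS, N_OBJECTIVES))  # toy reward vectors
Q = np.zeros((N_STATES, N_ACTIONS, N_OBJECTIVES))
alpha, gamma, eps = 0.1, 0.95, 0.1
state = 0
for _ in range(10_000):
    # epsilon-greedy on the scalarised Q-values
    if rng.random() < eps:
        action = int(rng.integers(N_ACTIONS))
    else:
        action = int(np.argmax(Q[state] @ w))
    next_state = int(rng.integers(N_STATES))   # toy random transitions
    best_next = int(np.argmax(Q[next_state] @ w))
    # vector-valued TD update; scalarisation only drives action selection
    td_target = R[state, action] + gamma * Q[next_state, best_next]
    Q[state, action] += alpha * (td_target - Q[state, action])
    state = next_state

print("LP weights (task, safety, comfort):", np.round(w, 3))
print("Greedy policy per state:", np.argmax(Q @ w, axis=1))

Keeping the Q-values vector-valued while scalarising only for action selection preserves the per-objective returns, which is what lets one check afterwards that the ethical components are actually being optimised.
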
Original language: English
Number of pages: 9
Status: Published - May 2023
Event: 2023 Adaptive and Learning Agents Workshop at AAMAS - London, United Kingdom
Duration: 29 May 2023 → 30 May 2023
https://alaworkshop2023.github.io

Workshop

Workshop: 2023 Adaptive and Learning Agents Workshop at AAMAS
Short title: ALA 2023
Country/Region: United Kingdom
City: London
Period: 29/05/23 → 30/05/23
Internet address: https://alaworkshop2023.github.io
