Multi-objective reinforcement learning for guaranteeing alignment with multiple values

Manel Rodriguez-Soto, Roxana Radulescu, Juan A. Rodriguez-Aguilar, Maite Lopez-Sanchez, Ann Nowe

Research output: Unpublished contribution to conference (Unpublished paper)

Abstract

In this paper, we address the problem of ensuring that autonomous learning agents behave in alignment with multiple moral values. Specifically, we present the theoretical principles and algorithmic tools needed to build an environment in which an agent is guaranteed to learn a behaviour (or policy) aligned with multiple moral values while still pursuing its individual objective.
To address this value alignment problem, we adopt the Multi-Objective Reinforcement Learning framework and propose a novel algorithm that combines techniques from Multi-Objective Reinforcement Learning and Linear Programming. In addition to providing theoretical guarantees, we illustrate our value alignment process with an example involving an autonomous vehicle, in which we show that the agent learns to behave in alignment with the ethical values of safety, achievement, and comfort. Finally, we use a synthetic multi-objective environment generator to evaluate the computational cost of guaranteeing ethical learning as the number of values increases.
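The paper itself provides the algorithmic details and guarantees; purely as a rough illustration of how a combination of Multi-Objective Reinforcement Learning and Linear Programming can enforce ethically aligned learning, the hypothetical sketch below searches for scalarisation weights under which a designated ethically optimal policy strictly dominates every alternative policy. The function name find_ethical_weights, the assumption that per-policy value vectors are precomputed, and the dominance margin are illustrative choices for this sketch, not the authors' actual algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def find_ethical_weights(value_vectors, ethical_index, margin=1e-3):
    """Find non-negative weights for the ethical objectives (the individual
    objective is fixed at weight 1) such that the designated ethical policy
    strictly dominates every other candidate under linear scalarisation.

    value_vectors : (num_policies, 1 + num_ethical) array; column 0 holds the
                    individual-objective value, the remaining columns hold the
                    ethical-objective values of each candidate policy.
    ethical_index : row index of the policy deemed ethically optimal.
    Returns the ethical-objective weight vector, or None if the LP is infeasible.
    """
    V = np.asarray(value_vectors, dtype=float)
    ethical = V[ethical_index]
    others = np.delete(V, ethical_index, axis=0)

    # For each competing policy pi we require
    #   (V_eth,0 - V_pi,0) + sum_j w_j * (V_eth,j - V_pi,j) >= margin,
    # rewritten below in linprog's A_ub @ w <= b_ub form.
    diff = ethical - others            # (num_others, 1 + num_ethical)
    A_ub = -diff[:, 1:]                # ethical-objective differences
    b_ub = diff[:, 0] - margin         # individual-objective difference

    num_ethical = V.shape[1] - 1
    # Minimise the total ethical weight so the scalarised reward stays as
    # close to the individual objective as the dominance guarantee allows.
    res = linprog(c=np.ones(num_ethical), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * num_ethical, method="highs")
    return res.x if res.success else None
```

Once such weights are found, a standard single-objective learner (e.g. Q-learning) trained on the weighted sum of rewards would, under this sketch's assumptions, converge only to policies aligned with the specified values.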
Original language: English
Number of pages: 9
Publication status: Published - May 2023
Event: 2023 Adaptive and Learning Agents Workshop at AAMAS - London, United Kingdom
Duration: 29 May 2023 - 30 May 2023
https://alaworkshop2023.github.io

Workshop

Workshop: 2023 Adaptive and Learning Agents Workshop at AAMAS
Abbreviated title: ALA 2023
Country/Territory: United Kingdom
City: London
Period: 29/05/23 - 30/05/23
Internet address: https://alaworkshop2023.github.io
