Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)

Peter Vamplew, Benjamin J. Smith, Johan Källström, Gabriel De Oliveira Ramos, Roxana Radulescu, Diederik M. Roijers, Conor F. Hayes, Fredrik Heintz, Patrick Mannion, Pieter Libin, Richard Dazeley, Cameron Foale

Research output: Article, peer-reviewed

5 Citations (Scopus)
15 Downloads (Pure)

Abstract

The recent paper "Reward is Enough" by Silver, Singh, Precup and Sutton posits that the concept of reward maximisation is sufficient to underpin all intelligence, both natural and artificial. We contest the underlying assumption of Silver et al. that such reward can be scalar-valued. In this paper we explain why scalar rewards are insufficient to account for some aspects of both biological and computational intelligence, and argue in favour of explicitly multi-objective models of reward maximisation. Furthermore, we contend that even if scalar reward functions can trigger intelligent behaviour in specific cases, it is still undesirable to use this approach for the development of artificial general intelligence due to unacceptable risks of unsafe or unethical behaviour.
Original language: English
Article number: 41
Pages (from-to): 1-19
Number of pages: 19
Journal: Autonomous Agents and Multi-Agent Systems
Volume: 36
Issue number: 2
Status: Published - 16 Jul 2022

Bibliographical note

Publisher Copyright:
© 2022, The Author(s).
