Modeling behavioral experiments on uncertainty and cooperation with population-based reinforcement learning

Elias Fernández Domingos, Jelena Grujić, Juan C. Burguillo, Francisco C. Santos, Tom Lenaerts

Research output: Article, peer review

7 Citations (Scopus)
102 Downloads (Pure)


From climate action to public health measures, human collective endeavors are often shaped by different uncertainties. Here we introduce a novel population-based learning model wherein a group of individuals facing a collective risk dilemma acquire their strategies over time through reinforcement learning, while handling different sources of uncertainty. In such an N-person collective risk dilemma, players make step-wise contributions to avoid a catastrophe that would result in a loss of wealth for all players. Success is attained if they collectively reach a certain contribution level over time or if, when the threshold is not reached, they are lucky enough to avoid the catastrophe. The dilemma lies in the trade-off between the proportion of personal wealth that players contribute to collectively reach the goal and the remainder they can keep at the end of the game. We show that the strategies learned with the model correspond to those experimentally observed, even when there is uncertainty about the risk of failing when the goal is not reached, the magnitude of the threshold to attain, or the time available to reach the target. We furthermore confirm that being unsure about the time window favors more extreme reactions and polarization, diminishing the number of agents that contribute fairly. The population-based online learning framework we propose is general enough to be applicable to a wide range of collective action problems and to arbitrarily large sets of available policies.
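The game mechanics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model and contains no learning component; it only resolves one play of a collective risk dilemma. The function name `play_crd`, its parameters (`endowment`, `threshold`, `risk`) and the rule that a catastrophe wipes out all remaining wealth are assumptions made for illustration, since the abstract does not specify them.

```python
import random


def play_crd(contributions, endowment, threshold, risk, rng=random):
    """Resolve one collective risk dilemma (illustrative sketch only).

    contributions: one list per player with that player's per-round contributions.
    endowment:     starting wealth of every player (assumed equal for all).
    threshold:     collective contribution needed to reach the goal.
    risk:          probability of catastrophe when the goal is missed.
    """
    totals = [sum(player) for player in contributions]
    if any(total > endowment for total in totals):
        raise ValueError("a player cannot contribute more than the endowment")

    # Wealth each player would keep if no catastrophe occurs.
    remaining = [endowment - total for total in totals]

    # The group succeeds outright if the collective target is met.
    reached = sum(totals) >= threshold

    # Otherwise the catastrophe strikes with probability `risk`.
    catastrophe = (not reached) and (rng.random() < risk)

    # Assumption: a catastrophe destroys all remaining wealth of every player.
    payoffs = [0.0 if catastrophe else float(r) for r in remaining]
    return payoffs, reached, catastrophe
```

For example, `play_crd([[2, 2], [0, 0]], endowment=4, threshold=4, risk=0.9)` describes a two-player, two-round game in which one player alone covers the threshold. The free-riding player keeps the full endowment, which is the trade-off the abstract refers to.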

Original language: English
Journal: Simulation Modelling Practice and Theory
Status: Published - May 2021


