Maximum Entropy Bayesian Actor Critic

Research output: Poster


Abstract

Maximum Entropy Bayesian Actor Critic aims to avoid the problems of sample inefficiency and convergence brittleness by combining Bayesian and Maximum Entropy Reinforcement Learning. A Soft Policy Gradient is derived from a soft value function that introduces an entropy-weighted term to the policy gradient theorem. A Max Entropy Parameter Update alters the Bayesian actor critic parameter update rule to account for the additional policy entropy term.
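The soft value function and entropy-weighted policy gradient mentioned above take the following standard maximum-entropy RL form; this is a sketch using common notation (temperature α, discount γ), which is assumed here and may differ from the poster's exact derivation:

```latex
% Maximum-entropy objective: expected return plus an entropy bonus
% weighted by a temperature \alpha (notation assumed, not from the poster).
J(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t} \gamma^{t}
  \Big( r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right]

% Soft value functions satisfying the soft Bellman equations:
Q_{\mathrm{soft}}^{\pi}(s,a) = r(s,a)
  + \gamma\, \mathbb{E}_{s'}\!\big[ V_{\mathrm{soft}}^{\pi}(s') \big],
\qquad
V_{\mathrm{soft}}^{\pi}(s) = \mathbb{E}_{a \sim \pi}\!\big[
  Q_{\mathrm{soft}}^{\pi}(s,a) - \alpha \log \pi(a \mid s) \big]

% Soft policy gradient: the entropy-weighted term -\alpha \log \pi_\theta
% enters the policy gradient theorem alongside the soft Q-value.
\nabla_{\theta} J(\pi_{\theta}) = \mathbb{E}\!\big[
  \nabla_{\theta} \log \pi_{\theta}(a \mid s)\,
  \big( Q_{\mathrm{soft}}^{\pi}(s,a) - \alpha \log \pi_{\theta}(a \mid s) \big) \big]
```

In a Bayesian actor-critic setting, the parameter update would then use this entropy-augmented gradient in place of the standard policy gradient, which is how the additional policy entropy term enters the update rule.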
Original language: English
Status: Published - 7 Nov 2019
Event: BNAIC 2019 - Brussels, Belgium
Duration: 7 Nov 2019 - 8 Nov 2019

Conference

Conference: BNAIC 2019
Country/Region: Belgium
City: Brussels
Period: 7/11/19 - 8/11/19

Fingerprint

Dive into the research topics of 'Maximum Entropy Bayesian Actor Critic'. Together they form a unique fingerprint.
  • Maximum Entropy Bayesian Actor Critic

Homer, S. T., 6 Nov 2019, BNAIC/Benelearn 2019. Vol. 2491. 12 p. (CEUR Workshop Proceedings).

Research output: Conference paper › Research

Cite this