Reinforcement Learning (RL) enables artificial agents to learn through direct interaction with the environment. However, it usually does not scale well to large problems because of its sample inefficiency. Reward shaping is a well-established approach that makes learning more efficient by incorporating domain knowledge into RL agents via supplementary rewards. In this work, we propose a novel methodology that automatically generates reward shaping functions from user-provided Linear Temporal Logic on finite traces (LTLf) formulas. Here, LTLf serves as a rich language that allows the user to communicate domain knowledge to the learning agent. In both single-agent and multi-agent settings, we demonstrate that our approach performs at least as well as the baseline approach while providing essential advantages in terms of flexibility and ease of use. We elaborate on some of these advantages empirically by demonstrating that our approach can handle domain knowledge with different levels of accuracy and gives the user the flexibility to express uncertainty in the provided advice.
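The abstract does not spell out how a shaping function is derived from a formula, so the following is only a minimal sketch of one common way to connect the two ideas: potential-based reward shaping, where the supplementary reward is F = γΦ(q') − Φ(q), computed over the states of a finite automaton that tracks progress on an LTLf formula. The automaton, the labels `key` and `door`, the distance-based potential, and all function names are illustrative assumptions and should not be read as the authors' implementation.

```python
from collections import deque

# Hypothetical DFA for the advice "eventually key, and after that eventually door".
# States: 0 = nothing seen, 1 = key seen, 2 = key then door seen (accepting).
TRANSITIONS = {
    (0, "key"): 1,
    (1, "door"): 2,
}
ACCEPTING = {2}
STATES = {0, 1, 2}


def step(state, label):
    """Advance the DFA on one observed label; unlisted pairs are self-loops."""
    return TRANSITIONS.get((state, label), state)


def distances_to_accepting():
    """Breadth-first search backwards from the accepting states over DFA edges."""
    dist = {q: 0 for q in ACCEPTING}
    queue = deque(ACCEPTING)
    while queue:
        target = queue.popleft()
        for (source, _label), dest in TRANSITIONS.items():
            if dest == target and source not in dist:
                dist[source] = dist[target] + 1
                queue.append(source)
    return dist


DIST = distances_to_accepting()


def potential(state):
    """Assumed potential: states closer to acceptance get a higher value."""
    # States that cannot reach acceptance fall back to the number of DFA states.
    return -float(DIST.get(state, len(STATES)))


def shaping_reward(state, next_state, gamma=0.99):
    """Potential-based shaping term F = gamma * Phi(q') - Phi(q)."""
    return gamma * potential(next_state) - potential(state)


if __name__ == "__main__":
    q = 0
    for label in ["empty", "key", "empty", "door"]:
        q_next = step(q, label)
        print(label, q, q_next, shaping_reward(q, q_next))
        q = q_next
```

In a learning loop, this term would be added to the environment reward at each step, with the automaton state updated from whatever labels the agent observes. Using the potential-based form is a design choice made here because it is known to leave the optimal policy unchanged; other potentials over the automaton states would fit the same skeleton.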
|Field|Value|
|---|---|
|Number of pages|17|
|Journal|Neural Computing & Applications|
|Early online date|7 Jun 2022|
|Publication status|Published - 7 Jun 2022|
- Reinforcement Learning
- Reward Shaping
- Linear Temporal Logic on finite traces
- Multi-agent Systems
Publication: 'A Framework for Flexibly Guiding Learning Agents'
- Active project: VLAAI1: Subsidie: Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen (Grant: Artificial Intelligence (AI) Research Programme Flanders), 1/07/19 → 31/12/23