Reward shaping is a commonly used approach in single-agent reinforcement learning to speed up the learning process. Potential-based reward shaping has recently also been applied to improve the performance of multi-agent reinforcement learning. In both single- and multi-agent settings, these speedups are achieved without losing any theoretical convergence guarantees. This paper describes the use of context-aware potential functions in a loosely coupled multi-agent system. In some multi-agent settings, the interactions between the agents occur only sporadically, in certain regions of the state space. It is clear that if speedups through reward shaping are to be achieved, a different shaping signal should be used in these different regions. We demonstrate how this can be achieved within FCQ-learning, an algorithm capable of automatically detecting when agents should take each other into consideration. Coordination problems can even be anticipated before they actually occur.
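The potential-based shaping mentioned in the abstract adds a term F(s, s') = γΦ(s') − Φ(s) to the environment reward, which is the form known to preserve optimal policies. The sketch below illustrates this idea only; the grid-world goal, the potential function `phi`, and the discount value are invented for illustration and are not taken from the paper.

```python
GAMMA = 0.9  # assumed discount factor, chosen for illustration

def phi(state):
    # Toy potential: negative Manhattan distance to a hypothetical
    # goal at (4, 4). Higher potential closer to the goal.
    x, y = state
    return -(abs(4 - x) + abs(4 - y))

def shaped_reward(reward, state, next_state, gamma=GAMMA):
    # Potential-based shaping: r + gamma * phi(s') - phi(s).
    # This form leaves the optimal policy unchanged.
    return reward + gamma * phi(next_state) - phi(state)

# A step from (0, 0) toward the goal receives a positive shaping bonus.
bonus = shaped_reward(0.0, (0, 0), (1, 0))
```

A context-aware variant, as the paper describes, would switch between different potential functions depending on whether the agent is in a region of the state space where coordination with other agents matters.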
|Name||Proceedings of the Adaptive and Learning Agents Workshop|
|Conference||Adaptive and Learning Agents Workshop|
|Abbreviated title||AAMAS 2012|
|Period||5/06/12 → …|
- Sparse Interactions
- Reward Shaping