Many real-world scenarios involve multiple agents, each trying to accomplish their own objectives. To deploy autonomous agents that can provide added value in such environments, it is therefore imperative that we gain a clear understanding of decision making within them. In this work, we study such systems, known as multi-objective multi-agent systems. Concretely, we look at Multi-Objective Normal-Form Games (MONFGs): deterministic, stateless games in which agents receive a vectorial payoff, with one component per objective, rather than a single scalar reward, based on their joint actions. For the first contributions of this thesis, we take a theoretical approach and prove five properties of Nash equilibria in MONFGs that had not previously been considered in the literature. For our second contribution, we take a reinforcement learning approach and study agents that are unaware of the dynamics of the underlying game. These agents must learn to coordinate their strategies by repeatedly playing a base MONFG. For this purpose, we design a collection of novel learning algorithms that allow agents to communicate preferred actions or strategies. We provide algorithms for both cooperative and self-interested agents and validate them in extensive experiments. We find that agents in cooperative settings learn moderately faster when using communication than simple independent learners without it, a result consistent with findings in single-objective reinforcement learning. In our self-interested settings, we further demonstrate, for the first time, the emergence of cyclic Nash equilibria in repeated MONFGs. We also study in detail whether agents in our set of benchmark games benefit from these novel communication approaches compared to simple independent learning. We find that in games with Nash equilibria, agents appear indifferent to communication, as the benefits are not substantial in the limited set of MONFGs we consider. In games without Nash equilibria, on the other hand, some level of communication does benefit the agents; we attribute this to communication helping them coordinate and converge to a compromise strategy.
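To make the setting concrete, the sketch below shows one way a two-player MONFG could be represented. The payoff matrices and the linear utility function are illustrative assumptions, not taken from the thesis; they only fix the idea that each agent receives a vector-valued payoff per joint action and judges it through its own utility function.

```python
# A minimal sketch of a two-player MONFG in Python/NumPy. The payoff
# matrices and the linear utility weights below are illustrative
# assumptions and are not taken from the thesis.
import numpy as np

# payoffs[i][a1, a2] is player i's vector payoff (one entry per objective)
# for the joint action (a1, a2) in a 2-action, 2-objective game.
payoffs = [
    np.array([[[4, 0], [3, 1]],
              [[1, 3], [0, 4]]]),  # player 1
    np.array([[[4, 0], [1, 3]],
              [[3, 1], [0, 4]]]),  # player 2
]

def expected_payoff(player: int, s1: np.ndarray, s2: np.ndarray) -> np.ndarray:
    """Expected vector payoff of `player` under mixed strategies s1, s2."""
    # Weight each joint action's payoff vector by its joint probability.
    return np.einsum('i,j,ijk->k', s1, s2, payoffs[player])

def utility(vec: np.ndarray, weights=np.array([0.5, 0.5])) -> float:
    """A hypothetical linear utility that scalarises a vector payoff.

    Under such a utility function, a joint strategy is a Nash equilibrium
    if no agent can raise its utility by unilaterally deviating.
    """
    return float(vec @ weights)

s1 = np.array([1.0, 0.0])  # player 1 plays action 0
s2 = np.array([0.0, 1.0])  # player 2 plays action 1
print(utility(expected_payoff(0, s1, s2)))  # -> 2.0
```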
- Multi-agent learning
- Multi-objective optimization
- Reinforcement learning
- Game theory
Communication in Multi-Objective Games: A Mixed Game-Theoretic and Reinforcement Learning Approach to Multi-Objective Normal-Form Games
Röpke, W. (Student). 2021
Student thesis: Master's Thesis