Abstract
We propose a novel framework for controller design in environments with a two-level structure: a known high-level graph ("map") in which each vertex is populated by a Markov decision process, called a "room". The framework "separates concerns" by using different design techniques for low- and high-level tasks. We apply reactive synthesis for high-level tasks: given a specification as a logical formula over the high-level graph and a collection of low-level policies obtained together with "concise" latent structures, we construct a "planner" that selects which low-level policy to apply in each room. We develop a reinforcement learning procedure to train low-level policies on latent structures, which, unlike previous approaches, circumvents a model distillation step. We pair the policy with probably approximately correct guarantees on its performance and on the abstraction quality, and lift these guarantees to the high-level task. These formal guarantees are the main advantage of the framework. Other advantages include scalability (rooms are large and their dynamics are unknown) and reusability of low-level policies. We demonstrate feasibility in challenging case studies where an agent navigates environments with moving obstacles and visual inputs.
Original language | English |
---|---|
Title | Composing Reinforcement Learning Policies, with Formal Guarantees |
Place of publication | Detroit, MI, USA |
Publisher | ACM |
Pages | 574-583 |
Number of pages | 10 |
Volume | Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems |
ISBN (electronic) | 9798400714269 |
Status | Published - 5 Jun 2025 |
Projects
- 2 Active
-
VLAAI1: Vlaams Artificiële Intelligentie Onderzoeksprogramma (VAIOP) – tweede cyclus
1/01/24 → 31/12/28
Project: Applied
-
iBOF/21/027: DESCARTES - infectieziekten economie en artificiële intelligentie met garanties
Nowe, A., Hens, N. & Beutels, P.
1/01/21 → 31/12/26
Project: Fundamental
-
Controller Synthesis from Deep Reinforcement Learning Policies
Delgrange, F., Avni, G., Lukina, A., Schilling, C., Nowe, A. & Pérez, G. A., 4 Mar 2025. 31 p. Research output: Unpublished paper
Open Access
-
Controller Synthesis from Deep Reinforcement Learning Policies
Delgrange, F., Avni, G., Lukina, A., Schilling, C., Nowe, A. & Pérez, G. A., 20 Oct 2024, p. 1-25. 25 p. Research output: Unpublished paper
Open Access
-
Controller Synthesis from Deep Reinforcement Learning Policies
Delgrange, F., Avni, G., Lukina, A., Schilling, C., Nowe, A. & Pérez, G. A., 28 Oct 2024. Research output: Unpublished paper
Open Access